Description

[0001] The present disclosure provides engineered enzymes comprising a protein scaffold and Specificity
Determining Regions, the production of such enzymes, and the use thereof for therapeutic, research, diagnostic,
nutritional care, personal care and industrial purposes.

Background 

[0002] Academic and industrial research continuously searches for functional proteins to be used as therapeutic,
research, diagnostic, nutritional, personal care or industrial agents. Today, such functional proteins can be classified
mainly into two categories: natural proteins and engineered proteins. Natural proteins, on the one hand, are discovered
from nature, e.g. by screening natural isolates or by sequencing genomes from diverse species. Engineered proteins,
on the other hand, are typically based on known proteins and are altered in order to acquire modified functionalities.
Disclosed herein are engineered proteins with novel functions as compared to the starting components. Such proteins
are called NBEs (New Biologic Entities). The NBEs disclosed are engineered enzymes with novel substrate specificities
or fusion proteins of such engineered enzymes with other functional components.

[0003] Specificity is an essential element of enzyme function. A cell consists of thousands of different, highly reactive
catalysts. Yet the cell is able to maintain a coordinated metabolism and a highly organized three-dimensional structure.
This is due in part to the specificity of enzymes, i.e. the selective conversion of their respective substrates. Specificity
is a qualitative and a quantitative property: the specificity of a particular enzyme can vary widely, ranging from just one
particular type of target molecule to all molecular types with certain chemical substructures. In nature, the specificity
of an organism's enzymes has evolved to meet the particular needs of the organism. Arbitrary specificities with high
value for therapeutic, research, diagnostic, nutritional or industrial applications are unlikely to be found in any organism's
enzymatic repertoire due to the large space of possible specificities. The only realistic way of obtaining such specificities
is their generation de novo.

[0004] When comparing enzymes with binders, a paradigm of specificity is given by antibodies recognizing individual
epitopes as small distinct structures within large molecules. The naturally occurring vast range of antibody specificities
is attributed to the diversity generated by the immune system combined with natural selection. Several mechanisms
contribute to the vast repertoire of antibody specificity and occur at different stages of immune response generation and
antibody maturation (Janeway, C. et al. (1999) Immunobiology, Elsevier Science Ltd., Garland Publishing, New York).
Specifically, antibodies contain complementarity determining regions (CDRs) which interact with the antigen in a highly
specific manner and allow discrimination even between very similar epitopes. The light as well as the heavy chain of
the antibody each contribute three CDRs to the binding domain. Nature uses recombination of various gene segments
combined with further mutagenesis in the generation of CDRs. As a result, the sequences of the six CDR loops are
highly variable in composition and length, and this forms the basis for the diversity of binding specificities in antibodies.
A similar principle for the generation of a diversity of catalytic specificities is not known from nature.
[0005] Catalysis, i.e. the increase of the rate of a specific chemical reaction, is, besides binding, the most important
protein function. Catalytic proteins, i.e. enzymes, are classified according to the chemical reaction they catalyze.
[0006] Transferases are enzymes transferring a group, for example a methyl group or a glycosyl group, from one
compound (generally regarded as donor) to another compound (generally regarded as acceptor). For example,
glycosyltransferases (EC 2.4) transfer glycosyl residues from a donor to an acceptor molecule. Some of the
glycosyltransferases also catalyze hydrolysis, which can be regarded as transfer of a glycosyl group from the donor to water.
The subclass is further subdivided into hexosyltransferases (EC 2.4.1), pentosyltransferases (EC 2.4.2) and those transferring
other glycosyl groups (EC 2.4.99; Nomenclature Committee of the International Union of Biochemistry and Molecular
Biology (NC-IUBMB)).

[0007] Oxidoreductases catalyze oxido-reductions. The substrate that is oxidized is regarded as hydrogen or electron
donor. Oxidoreductases are classified as dehydrogenases, oxidases, mono- and dioxygenases. Dehydrogenases
transfer hydrogen from a hydrogen donor to a hydrogen acceptor molecule. Oxidases react with molecular oxygen as hydrogen
acceptor and produce oxidized products as well as either hydrogen peroxide or water. Monooxygenases transfer one
oxygen atom from molecular oxygen to the substrate, and the other is reduced to water. In contrast, dioxygenases catalyze
the insertion of both oxygen atoms from molecular oxygen into the substrate.

[0008] Lyases catalyze elimination reactions and thereby generate double bonds or, in the reverse direction, catalyze
additions at double bonds. Isomerases catalyze intramolecular rearrangements. Ligases catalyze the formation of
chemical bonds at the expense of ATP consumption.
[0009] Finally, hydrolases are enzymes that catalyze the hydrolysis of chemical bonds like C-O or C-N. The EC
classification for these enzymes generally classifies them by the nature of the bond hydrolysed and by the nature of the
substrate. Hydrolases such as lipases and proteases play an important role in nature as well as in technical applications
of biocatalysts. Proteases hydrolyse a peptide bond within the context of an oligo- or polypeptide. Depending on the
catalytic mechanism, proteases are grouped into aspartic, serine, cysteine, metallo- and threonine proteases (Handbook
of proteolytic enzymes. (1998) Eds: Barrett, A.; Rawlings, N.; Woessner, J.; Academic Press, London). This classification
is based on the amino acid side chains that are responsible for catalysis and which are typically presented in the active
site in very similar orientation to each other. The scissile bond of the substrate is brought into register with the catalytic
residues due to specific interactions between the amino acid side chains of the substrate and complementary regions
of the protease (Perona, J. & Craik, C. (1995) Protein Science, 4, 337-360). The residues on the N- and C-terminal side
of the scissile bond are usually called P1, P2, P3 etc. and P1', P2', P3', and the binding pockets complementary to the
substrate S1, S2, S3 and S1', S2', S3', respectively (nomenclature according to Schechter & Berger, Biochem. Biophys.
Res. Commun. 27 (1967) 157-162). The selectivity of proteases can vary widely from being virtually nonselective - e.g.
the subtilisins - over a strict preference at the P1 position - e.g. trypsin selectively cutting on the C-terminal side of
arginine or lysine residues - to highly specific proteases - e.g. human tissue-type plasminogen activator (t-PA) cleaving
at the C-terminal side of the arginine in the sequence CPGRVVGG (Ding, L. et al. (1995) Proc. Natl. Acad. Sci. USA 92,
7627-7631; Coombs, G. et al. (1996) J. Biol. Chem. 271, 4461-4467).
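
The subsite nomenclature described above can be illustrated with a short sketch. The function names, the example peptide and the simplified trypsin-like rule (cleavage after any arginine or lysine, ignoring further sequence context) are assumptions made for illustration only and are not part of the disclosure.

```python
def label_subsites(peptide: str, bond: int) -> dict[str, str]:
    """Label residues around the scissile bond between peptide[bond-1] and peptide[bond].

    Labels follow the P1, P2, ... / P1', P2', ... convention: P1 is the residue
    immediately N-terminal of the bond, P1' the residue immediately C-terminal.
    """
    if not 0 < bond < len(peptide):
        raise ValueError("bond must lie strictly inside the peptide")
    labels = {}
    # Non-primed side: count outwards from the bond towards the N-terminus.
    for i, residue in enumerate(reversed(peptide[:bond]), start=1):
        labels[f"P{i}"] = residue
    # Primed side: count outwards from the bond towards the C-terminus.
    for i, residue in enumerate(peptide[bond:], start=1):
        labels[f"P{i}'"] = residue
    return labels


def trypsin_like_bonds(peptide: str) -> list[int]:
    """Return bond positions whose P1 residue is arginine (R) or lysine (K).

    Simplified, hypothetical rule used only to illustrate a P1 preference.
    """
    return [i for i in range(1, len(peptide)) if peptide[i - 1] in "RK"]


if __name__ == "__main__":
    example = "GAKLSRVE"  # made-up peptide, not taken from the disclosure
    for bond in trypsin_like_bonds(example):
        print(bond, label_subsites(example, bond))
```

In this sketch a bond position of 6 in the made-up peptide places the arginine at P1 and the following valine at P1'.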

[0010] The specificity of proteases, i.e. their ability to recognize and hydrolyze preferentially certain peptide substrates,
can be expressed qualitatively and quantitatively. Qualitative specificity refers to the kind of amino acid residues that
are accepted by a protease at certain positions of the peptide substrate. For example, trypsin and t-PA are related with
respect to their qualitative specificity, since both of them require at the P1 position an arginine or a similar residue. On
the other hand, quantitative specificity refers to the relative number of peptide substrates that are accepted as substrates
by the protease, or more precisely, to the relative kcat/KM ratios of the protease for the different peptides that are accepted
by the protease. Proteases that accept only a small portion of all possible peptides have a high specificity, whereas the
specificity of proteases that, as an extreme, cleave any peptide substrate would theoretically be zero.
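
As a minimal sketch of the quantitative notion above, the following compares kcat/KM ratios across a panel of peptides. The index used here (one minus the fraction of peptides accepted above a relative threshold) is a hypothetical illustration chosen so that a protease accepting every peptide scores zero; it is not a formula given in the disclosure, and the threshold and kinetic values are invented.

```python
def relative_efficiencies(kcat_over_km: dict[str, float]) -> dict[str, float]:
    """Normalize kcat/KM values to the best substrate in the panel."""
    best = max(kcat_over_km.values())
    if best <= 0:
        raise ValueError("at least one substrate must have a positive kcat/KM")
    return {peptide: value / best for peptide, value in kcat_over_km.items()}


def specificity_index(kcat_over_km: dict[str, float], threshold: float = 0.01) -> float:
    """Hypothetical index: 1 - (fraction of peptides accepted).

    A peptide counts as accepted when its relative kcat/KM is at least
    `threshold` (an assumed cut-off). A protease that accepts every peptide
    in the panel scores 0.0.
    """
    relative = relative_efficiencies(kcat_over_km)
    accepted = sum(1 for value in relative.values() if value >= threshold)
    return 1.0 - accepted / len(relative)


if __name__ == "__main__":
    # Invented kcat/KM values (arbitrary units) for four made-up peptides.
    panel = {"pep1": 1000.0, "pep2": 50.0, "pep3": 2.0, "pep4": 0.0}
    print(relative_efficiencies(panel))
    print(specificity_index(panel))
```

The index depends on the size and composition of the peptide panel, so values are only comparable between proteases measured against the same panel.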
[0011] Comparison of the primary, secondary as well as the tertiary structure of proteases (Fersht, A., Enzyme Structure
and Mechanism, W. H. Freeman and Company, New York, 1995) allows identification of classes showing a high degree
of conservation (Rawlings, N.D. & Barrett, A.J. (1997) In: Proteolysis in Cell Functions, Eds. Hopsu-Havu, V.K.; Järvinen,
M.; Kirschke, H., pp. 13-21, IOS Press, Amsterdam). A widely accepted scheme for protease classification has been
proposed by Rawlings & Barrett (Handbook of proteolytic enzymes. (1998) Eds: Barrett, A.; Rawlings, N.; Woessner, J.;
Academic Press, London). For example, the serine protease family can be subdivided into structural classes with
chymotrypsin (class S1), subtilisin (class SB) and carboxypeptidase (class SC) folds, each of which includes nonspecific
as well as specific proteases (Rawlings, N.D. & Barrett, A.J. (1994) Methods Enzymol. 244, 19-61). This applies to other
protease families analogously. An additional distinction can be made according to the relative location of the cleaved
bond in the substrate. Carboxy- and aminopeptidases cleave amino acids from the C- and N-terminus, respectively,
while endopeptidases cut anywhere along the oligopeptide.

[0012] Many applications would be conceivable if enzymes with a basically unlimited spectrum of specificities were
available. However, the use of such enzymes with high, low or any defined specificity is currently limited to those which
can be isolated from natural sources. The field of application for these enzymes varies from therapeutic, research,
diagnostic and nutritional to personal care and industrial purposes.

[0013] Enzyme additives in detergents have come to constitute nearly a third of the whole industrial enzyme market.
Detergent enzymes include proteinases for removing organic stains, lipases for removing greasy stains, amylases for
removing residues of starchy foods and cellulases for restoring the smooth surface of the fiber. The best known detergent
enzyme is probably the nonspecific proteinase subtilisin, isolated from various Bacillus species.

[0014] Starch enzymes, such as amylases, account for the majority of enzymes used in food processing. While starch
enzymes include products that are important for textile desizing, alcohol fermentation, paper and pulp processing, and laundry
detergent additives, the largest application is the production of high fructose corn syrup. The production of corn syrup
from starch by means of industrial enzymes was a successful alternative to acid hydrolysis.

[0015] Apart from starch processing, enzymes are used for an increasing range of applications in food. Enzymes in
food can improve texture, appearance and nutritional value or may generate desirable flavours and aromas. Currently
used food enzymes in bakery are amylases, amyloglycosidases, pentosanases for breakdown of pentosan and reduced
gluten production, or glucose oxidases to increase the stability of dough. Common enzymes for dairy are rennet (protease)
as coagulant in cheese production, lactase for hydrolysis of lactose, protease for hydrolysis of whey proteins or catalase
for the removal of hydrogen peroxide. Enzymes used in the brewing process are the above named amylases, but also
cellulases or proteases to clarify the beer from suspended proteins. In wines and fruit juices, cloudiness is more commonly
caused by starch and pectins, so that amylases and pectinases increase yield and clarification. Papain and other
proteinases are used for meat tenderizing.

[0016] Enzymes have also been developed to aid animals in the digestion of feed. In the western hemisphere, corn
is a major source of food for cattle, swine, and poultry. In order to improve the bioavailability of phosphate from corn,
phytase is commonly added (Wyss, M. et al. Biochemical characterization of fungal phytases (myo-inositol
hexakisphosphate phosphohydrolases): Catalytic properties. Applied & Environmental Microbiology 65, 367-373 (1999)). Moreover,
phytate hydrolysis has been shown to bring about improvements in digestibility of protein and absorption of minerals
such as calcium (Bedford, M. R. & Schulze, H. Exogenous enzymes for pigs and poultry [Review]. Nutrition
Research Reviews 11, 91-114 (1998)). Another major feed enzyme is xylanase. This enzyme is particularly useful as a
supplement for feeding stuff comprising more than about 10% of wheat, barley or rye, because of their relatively high
soluble fiber content. Xylanases have two important effects: reduction of the viscosity of the intestinal contents by
hydrolyzing the gel-like high molecular weight arabinoxylans in feed (Murphy, T. C., Bedford, M. R. & McCracken, K. J. Effect
of a range of new xylanases on in vitro viscosity and on performance of broiler diets. British Poultry Science 44,
S16-S18 (2003)) and breakdown of polymers in cell walls, which improves the bioavailability of protein and starch.
[0017] Biotech research and development laboratories routinely use special enzymes in small quantities along with
many other reagents. These enzymes create a significant market for various enzymes. Enzymes like alkaline
phosphatase, horseradish peroxidase and luciferase are only some examples. Thermostable DNA polymerases like Taq
polymerase or restriction endonucleases revolutionized laboratory work. Therapeutic enzymes are a particular class of
drugs, categorized by the FDA as biologicals, with many advantages compared to other, especially non-biological
pharmaceuticals. Examples of successful therapeutic enzymes are human clotting factors like factor VIII and factor IX
for human treatment. In addition, digestive enzymes are used for various deficiencies in human digestive processes.
Other examples are t-PA and streptokinase for the treatment of cardiovascular disease, beta-glucocerebrosidase for
the treatment of Type I Gaucher disease, L-asparaginase for the treatment of acute lymphoblastic leukemia and
DNAse for the treatment of cystic fibrosis. An important issue in the application of proteins as therapeutics is their potential
immunogenicity. To reduce this risk, one would prefer enzymes of human origin, which narrows down the set of available
enzymes. The provision of designed enzymes, preferably of human origin, with novel, tailor-made specificities would
allow the specific modification of target substrates at will, while minimizing the risk of immunogenicity. A further advantage
of highly specific enzymes as therapeutics would be their lower risk of side effects. Due to the limited possibility of specific
interactions between a small molecule and a protein, binding to non-target proteins and therefore side effects are quite
common and often cause termination of an otherwise promising lead compound. Specific enzymes, on the other hand,
provide many more contact sites and mechanisms for substrate discrimination and therefore enable a higher specificity
and thereby fewer side activities.

[0018] Proteases represent an important class of therapeutic agents (Drugs of Today 33, 641-648 (1997)). However,
currently the therapeutic protease is usually a substitute for insufficient activity of the body's own proteases. For example,
factor VII can be administered in certain cases of coagulation deficiencies of bleeders or during surgery (Heuer, L.;
Blumenberg, D. (2002) Anaesthesist 51:388). Tissue-type plasminogen activator (t-PA) is applied in acute cardiac
infarction, initializing the dissolution of fibrin clots through specific cleavage and activation of plasminogen (Verstraete, M.
et al. (1995) Drugs, 50, 29-41). So far a protease with tailor-made specificity is generated to provide a therapeutic agent
that specifically activates or inactivates a disease related target protein.

[0019] Monoclonal antibodies represent another important biological class of substances with therapeutic capabilities.
Among the main antibody targets are tumor necrosis factors (TNFs), which belong to the family of cytokines. TNFs play
a major role in the inflammation process. As homotrimers they can bind to receptors of nearly every cell. They activate
a multiplicity of cellular genes, multiple signal transduction mechanisms, kinases and transcription factors. The most
important TNFs are TNF-alpha and TNF-beta. TNF-alpha is produced by macrophages, monocytes and other cells.
TNF-alpha is an inflammation mediator. Therefore, research of the last decade has been focused on TNF-alpha inhibitors
like monoclonal antibodies as possible therapeutics for different therapeutic indications like rheumatoid arthritis, Crohn's
disease or psoriasis (Hamilton et al. (2000) Expert Opin. Pharmacother. 1(5): 1041-1052). One of the major disadvantages
of monoclonal antibodies is their high cost, so that new biological alternatives are of great importance.
[0020] There are many examples of engineered enzymes in the literature. Fulani et al. (Fulani F. et al. (2003) Protein
Engineering 16, 515-519) describe a rhodanase (thiosulfate:cyanide sulfurtransferase) from Azotobacter vinelandii which
has a catalytic domain structurally related to the catalytic subunit of Cdc25 phosphatase enzymes. The difference in catalytic
mechanism depends on the different size of the active site. Both rhodanase and phosphatase are highly specific for
different substrates (sulfate vs. phosphate). The catalytic mechanism of the rhodanase could be shifted towards
serine/threonine phosphatase by single-residue insertion. Therefore, Fulani et al. give a single example for the change of a
catalytic mechanism by structural comparison and sequence alignment of naturally known enzymes from different enzyme
classes but lack an indication of how to generate a user-definable substrate specificity while keeping the same catalytic
mechanism.

[0021] The thioredoxin reductase described by Briggs et al. (WO 02/090300 A2) has an altered cofactor specificity
which preferably binds NADPH compared to NADH. Thus, both enzymes, the starting point as well as the resulting
engineered enzyme, are highly specific towards different substrates. The methods to achieve such an altered substrate
specificity are either computational processing methods or sequence alignments of related proteins to define variable
and conserved residues. They all have in common that they are based on the comparison of structures and sequences
of proteins with known specificities followed by the transfer of the same to another backbone.

[0022] There are other examples of specificity-engineered enzymes and, in particular, of proteases which have been
published in the literature. None of these examples, however, provides a means for generating novel specificities
compared to the specificity of the starting material used within the described methods. The methods range from
structure-directed single point mutations (Kurth, T. et al. (1998) Biochemistry 37, 11434-11440; Ballinger, M. et al. (1996)
Biochemistry, 35:13579-13585), exchange of surface loops between two specific proteases (Horrevoets et al. (1993) J.
Biol. Chem. 268, 779-782), to random mutagenesis either regio-selectively or across the whole gene combined with in
vitro or in vivo selection (Sices, H. & Kristie, T. (1998) Proc. Natl. Acad. Sci. USA, 95, 2828-2833).

[0023] The rational design of protease specificity is limited to very few examples. This approach is severely limited by
the insufficient understanding of the complexities that govern folding and dynamics as well as structure-function
relationships in proteins (Corey, M.J. & Corey, E. (1996) Proc. Natl. Acad. Sci. USA 93:11428-11434). It is therefore difficult
to alter the primary amino acid sequence of a protease in order to change its activity or specificity in a predictive way.
In a successful example, Kurth et al. engineered trypsin to show a preference for a dibasic motif (Kurth, T. et al. (1998)
Biochemistry, 37:11434-11440). In another example, Hedstrom et al. converted the S1 substrate specificity of trypsin to
that of chymotrypsin (Hedstrom, L. et al. (1992) Science, 255:1249-1253). This is an example where a known property
was transferred from one backbone to another.

[0024] Ballinger et al. (WO 96/27671) describe subtilisin variants with combination mutations (N62D/G166D, and
optionally Y104D) having a shift of substrate specificity towards peptide or polypeptide substrates with basic amino acids
at the P1, P2 and P4 positions of the substrate. Suitable substrates of the variant subtilisin were revealed by sorting a
library of phage particles (substrate phage) containing five contiguous randomized residues. These subtilisin variants
are useful for cleaving fusion proteins with basic substrate linkers and processing hormones or other proteins (in vitro
or in vivo) that contain basic cleavage sites. The problems associated with rational redesign of enzymes can partially
be overcome by directed evolution (as disclosed in PCT/EP03/04864). These studies can be classified by their expression
and selection systems. Genetic selection means to produce inside an organism an enzyme, e.g. a protease, which is
able to cleave a precursor protein, which in turn results in an alteration of the growth behavior of the producing organism.
From a population of organisms with different proteases those can be selected which have an altered growth behavior.
This principle was for example reported by Davis et al. (US 5258289, WO 96/21009). The production of a phage system
is dependent on the cleavage of a phage protein which can only be activated in the presence of a proteolytic enzyme
which is able to cleave the phage protein. Other approaches use a reporter system which allows a selection by screening
instead of a genetic selection, but also cannot overcome the intrinsic insufficiency of the intracellular characterization of
enzymes.

[0025] Systems to generate enzymes with altered sequence specificities with self-secreting enzymes are also reported.
Duff et al. (WO 98/11237) describe an expression system for a self-secreting protease. An essential element of the
experimental design is that the catalytic reaction acts on the protease itself by an autoproteolytic processing of the
membrane-bound precursor molecule to release the matured protease from the cellular membrane into the extracellular
environment. Therefore, a fusion protein must be constructed where the target peptide sequence replaces the natural
cleavage site for autoproteolysis. Limitations of such a system are that positively identified proteases will have the ability
to cleave a certain amino acid sequence but they also may cleave many other peptide sequences. Therefore, high
substrate specificity cannot be achieved. Additionally, such a system is not able to control that selected proteases cleave
at a specific position in a defined amino acid sequence, and it does not allow a precise characterization of the kinetic
constants of the selected proteases (kcat, KM).

[0026] A method has been described that aims at the generation of new catalytic activities and specificities within the
α/β-barrel proteins (WO 01/42432; Fersht et al. Methods of producing novel enzymes; Altamirano et al. (2000) Nature
403, 617-622). The α/β-barrel proteins comprise a large superfamily of proteins accounting for a large fraction of all
known enzymes. The structure of the proteins is made from an α/β-barrel surrounded by α-helices. The loops connecting
β-strands and helices comprise the so-called lid structure including the active site residues. The method is based on
the classification of α/β-barrel proteins into two classes based on the catalytic lid structure. An extensive comparison of
α/β-barrel protein structures led the authors to the conclusion that the substrate binding and specificity is primarily defined
by the barrel structure while the specificity of the chemical reaction resides within the loops. It is suggested that barrels
and lid structures from different enzymes can be combined to generate new enzymatic activities and to provide a starting
point to fine tune the properties by targeted or randomized mutagenesis and selection. The method does not provide
for the generation of user-defined specificity.

[0027] In summary, it is clear that there are many possible applications in the fields of therapeutics, research and
diagnostics, industrial enzymes, food and feed processing, cosmetics and other areas that would become possible by
the availability of enzymes with a novel substrate specificity. However, only a limited number of specific enzymes has
been identified from natural sources so far. Methods of rational design to modify, alter, convert or transfer sequence
specificity as well as the random approaches described above did not enable the generation of a novel and user-definable
specificity that was not present in the employed starting material.

[0028] Therefore, none of the currently available methods can provide enzymes with a novel and user-defined sequence
specificity. In contrast, the current invention provides such enzymes as well as methods for generating them.
Summary of the Invention

[0029] The objective is to provide engineered proteins with novel functions that do not exist in the components used 
for the engineering of such proteins. In particular, the disclosure provides enzymes with user-definable specificities.
User-definable specificity means that enzymes are provided with specificities that do not exist in the components used
for the engineering of such enzymes. The specificities can be chosen by the user so that one or more intended target 
substrates are preferentially recognised and converted by the enzymes. Furthermore, the disclosure provides enzymes 
that possess essentially identical sequences to human proteins but have different specificities. In a particular embodiment, 
the disclosure provides proteases with user-definable specificities. 

[0030] Furthermore, the present disclosure is directed to engineered enzymes which are fused to one or more further
functional components. These further components can be proteinaceous components which preferably have binding
properties and are of the group consisting of substrate binding domains, antibodies, receptors or fragments thereof.
Furthermore, these further components can be further functional components, preferably being selected from the group
consisting of polyethylene glycols, carbohydrates, lipids, fatty acids, nucleic acids, metals, metal chelates, and fragments
or derivatives thereof. The resulting fusion proteins are understood as enzymes with user-definable specificities.

[0031] In addition, the disclosure is directed to the application of such enzymes with novel, user-definable specificities
for therapeutic, research, diagnostic, nutritional, personal care or industrial purposes. Moreover, the disclosure is directed
to a method for generating engineered enzymes with user-definable specificities. In particular, the disclosure is directed
to generating enzymes that possess essentially identical sequences to human enzymes but have different specificities.

[0032] This problem has been solved by the embodiments specified in the description below and in the claims. The
present disclosure is thus directed to

(1) a proteolytic enzyme with catalytic activity of defined specificity not conferred by the protein scaffold and
characterized by a combination of the following components:

(a) a protein scaffold having at least 90% homology to human trypsin I having the amino acid sequence shown
in SEQ ID NO:1, and being capable of catalyzing at least one peptide cleavage on at least one target peptide
substrate, and

(b) one or more specificity determining regions inserted or substituted with the protein scaffold at sites in the
protein scaffold that enable the resulting proteolytic enzyme to distinguish the target substrate at as many sites
as are necessary to preferentially hydrolyse the target substrate versus one or more other substrates, and
wherein the specificity determining regions are inserted or substituted at one or more positions from the group
of positions that correspond structurally or by amino acid sequence homology to the regions 18-25, 38-48,
54-63, 73-86, 122-130, 148-156, 165-171 and 194-204 in human trypsin I having the amino acid sequence
shown in SEQ ID NO:1, and wherein the specificity determining regions are peptide sequences having a length
of less than 50 amino acid residues;

(2) the use of a proteolytic enzyme as defined in (1) above for therapeutic, research, diagnostic, nutritional, personal
care or industrial purposes;

(3) a method for generating a proteolytic enzyme as defined in (1) above having defined specificity towards at least
one target substrate, such specificity not being present in the individual starting components, comprising at least
the following steps:

(a) providing a protein scaffold having at least 90% homology to human trypsin I having the amino acid sequence
shown in SEQ ID NO:1, which catalyzes at least one chemical reaction on at least one target substrate,

(b) generating a library of proteolytic enzymes or isolated proteolytic enzymes by combining a polynucleotide
encoding the protein scaffold from step (a) via insertion or substitution with 1 to 11 fully or partially random
synthetic oligonucleotide sequences encoding peptide sequences with a length of less than 50 amino acid
residues at one or more positions from the group of positions within the polynucleotide encoding the protein scaffold
that correspond structurally or by amino acid sequence homology to the regions 18-25, 38-48, 54-63, 73-86,
122-130, 148-156, 165-171 and 194-204 in human trypsin I having the amino acid sequence shown in SEQ ID
NO:1, expressing said enzymes, and

(c) selecting out of the library of proteolytic enzymes generated in step (b) one or more enzymes that have
defined specificities not conferred by the protein scaffold provided in step (a) towards at least one target substrate;


(4) a fusion protein which is comprised of at least one proteolytic enzyme as defined in (1) above and 

(i) at least one further proteinaceous component, preferably being selected from the group consisting of binding
domains, receptors, antibodies, regulation domains, pro-sequences, and fragments thereof, and/or

(ii) at least one further functional component, preferably being selected from the group consisting of polyethylene
glycols, carbohydrates, lipids, fatty acids, nucleic acids, metals, and metal chelates;

EP 1 633 865 B1

(5) a composition or pharmaceutical composition comprising one or more proteolytic enzymes as defined in (1)
above or a fusion protein as defined in (4) above, said pharmaceutical composition may optionally comprise an
acceptable carrier, excipient and/or auxiliary agent;

(6) a nucleic acid encoding a proteolytic enzyme as defined in (1) above or a fusion protein as defined in (4) above;

(7) a vector comprising the nucleic acid as defined in (6) above;

(8) a host cell or transgenic organism being transformed/transfected with a vector as defined in (7) above or comprising
the nucleic acid as defined in (6) above; and

(9) a method for producing the proteolytic enzyme as defined in (1) above or a fusion protein as defined in (4) above
comprising culturing a cell or organism as defined in (8) above, and optionally isolating the enzyme from the culture
broth.

15 

Brief description of the Figures 

[0033] The following figures are provided in order to explain further the present invention in supplement to the detailed 
description: 


Figure 1 illustrates the three-dimensional structure of human trypsin I with the active site residues shown in
"ball-and-stick" representation and with the marked regions indicating potential SDR insertion sites.

Figure 2 shows the alignment of the primary amino acid sequences of three members of the serine protease class
S1 family: human trypsin I, human alpha-thrombin and human enteropeptidase (see also SEQ ID NOs: 1, 5 and 6).

Figure 3 illustrates the three-dimensional structure of subtilisin with the active site residues being shown in
"ball-and-stick" representation and with the numbered regions indicating potential SDR insertion sites.

Figure 4 shows the alignment of the primary amino acid sequences of four members of the serine protease class
S8 family: subtilisin E, furin, PC1 and PC5 (see also SEQ ID NOs: 7-10).

Figure 5 illustrates the three-dimensional structure of pepsin with the active site residues being shown in
"ball-and-stick" representation and with the numbered regions indicating potential SDR insertion sites.

Figure 6 shows the alignment of the primary amino acid sequences of three members of the A1 aspartic acid protease
family: pepsin, β-secretase and cathepsin D (see also SEQ ID NOs: 11-13).

Figure 7 illustrates the three-dimensional structure of caspase 7 with the active site residues being shown in
"ball-and-stick" representation and with the numbered regions indicating potential SDR insertion sites.

Figure 8 shows the primary amino acid sequence of caspase 7 as a member of the cysteine protease class C14
family (see also SEQ ID NO: 14).

Figure 9 depicts schematically the third aspect of the disclosure.

Figure 10 shows a Western blot analysis of a culture supernatant of cells expressing variants of human trypsin I
with SDR1 and SDR2, compared to negative controls.

Figure 11 shows the time course of the proteolytic cleavage of a target substrate by human trypsin I.

Figure 12 shows the relative activities of three variants of engineered proteolytic enzymes in comparison with human
trypsin I on two different peptide substrates.

Figure 13 shows the relative specificities of human trypsin I and variants of engineered proteolytic enzymes with
one or two SDRs, respectively.

Figure 14 shows the relative specificities of human trypsin I and of variants of engineered proteolytic enzymes
being specific for human TNF-alpha with this scaffold on peptides with a target sequence of human TNF-alpha.

Figure 15 shows the reduction of cytotoxicity induced by TNF-alpha when incubating the TNF-alpha with
concentrated supernatant from cultures expressing the engineered proteolytic enzymes being specific for human TNF-alpha.

Figure 16 shows the reduction of cytotoxicity induced by TNF-alpha when incubating the TNF-alpha with purified
engineered proteolytic enzyme being specific for human TNF-alpha.

Figure 17 compares the activity of engineered proteolytic enzymes being specific for human TNF-alpha with the
activity of human trypsin I on two protein substrates: (a) human TNF-alpha; (b) mixture of human serum proteins.

Figure 18 shows the specific activity of an engineered proteolytic enzyme with specificity for human VEGF.

Definitions 


[0034] In the framework of the present invention the following terms and definitions are used. 
[0035] The term "protease" means any protein molecule that is capable of hydrolysing peptide bonds. This includes
naturally-occurring or artificial proteolytic enzymes, as well as variants thereof obtained by site-directed or random
mutagenesis or any other protein engineering method, any active fragment of a proteolytic enzyme, or any molecular
complex or fusion protein comprising one of the aforementioned proteins. A "chimera of proteases" means a fusion
protein of two or more fragments derived from different parent proteases.

[0036] The term "substrate" means any molecule that can be converted catalytically by an enzyme. The term "peptide
substrate" means any peptide, oligopeptide, or protein molecule of any amino acid composition, sequence or length,
that contains a peptide bond that can be hydrolyzed catalytically by a protease. The peptide bond that is hydrolyzed is
referred to as the "cleavage site". Numbering of positions in the substrate is done according to the system introduced
by Schechter & Berger (Biochem. Biophys. Res. Commun. 27 (1967) 157-162). Amino acid residues adjacent N-terminal
to the cleavage site are numbered P1, P2, P3, etc., whereas residues adjacent C-terminal to the cleavage site are
numbered P1', P2', P3', etc.
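
The numbering convention above can be illustrated with a short sketch. The function name, its parameters and the example peptide are hypothetical and are not taken from the disclosure; the cleavage site is given as the 1-based position of the residue immediately N-terminal to the hydrolyzed bond.

```python
def label_cleavage_positions(peptide: str, cleavage_after: int, span: int = 3) -> dict:
    """Label residues around a cleavage site in Schechter & Berger notation.

    cleavage_after: 1-based index of the residue directly N-terminal to the
    hydrolyzed peptide bond (this residue is P1, the next one is P1').
    span: how many positions to label on each side of the bond.
    """
    if not 1 <= cleavage_after < len(peptide):
        raise ValueError("the cleavage site must lie between two residues")
    labels = {}
    for i in range(1, span + 1):
        n_idx = cleavage_after - i        # 0-based index of Pi (N-terminal side)
        c_idx = cleavage_after + i - 1    # 0-based index of Pi' (C-terminal side)
        if n_idx >= 0:
            labels[f"P{i}"] = peptide[n_idx]
        if c_idx < len(peptide):
            labels[f"P{i}'"] = peptide[c_idx]
    return labels


# Hypothetical peptide, hydrolyzed between the 4th and 5th residue (R and S).
positions = label_cleavage_positions("AGKRSVL", cleavage_after=4)
print(positions)
```

In this illustration the residue R occupies P1 and S occupies P1'; positions further from the bond receive higher numbers on either side.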

[0037] The term "target substrate" describes a user-defined substrate which is specifically recognized and converted
by an enzyme according to the invention. The term "target peptide substrate" describes a user-defined peptide substrate.
The term "target specificity" describes the qualitative and quantitative specificity of an enzyme that is capable of
recognizing and converting a target substrate. Catalytic properties of enzymes are expressed using the kinetic parameters
"KM" or "Michaelis-Menten constant", "kcat" or "catalytic rate constant", and "kcat/KM" or "catalytic efficiency", according
to the definitions of Michaelis and Menten (Fersht, A., Enzyme Structure and Mechanism, W. H. Freeman and Company,
New York, 1995). The term "catalytic activity" describes quantitatively the conversion of a given substrate under defined
reaction conditions.

[0038] The term "specificity" means the ability of an enzyme to recognize and convert preferentially certain substrates.
Specificity can be expressed qualitatively and quantitatively. "Qualitative specificity" refers to the chemical nature of the
substrate residues that are recognized by an enzyme. "Quantitative specificity" refers to the number of substrates that
are accepted as substrates. Quantitative specificity can be expressed by the term s, which is defined as the negative
logarithm of the number of all accepted substrates divided by the number of all possible substrates. Proteases, for
example, that accept preferentially a small portion of all possible peptide substrates have a "high specificity". Proteases
that accept almost any peptide substrate have a "low specificity". Definitions are made in accordance with WO 03/095670.
Proteases with very low specificity are also referred to as "unspecific proteases". The term "defined specificity" refers
to a certain type of specificity, i.e. to a certain target substrate or a set of certain target substrates that are preferentially
converted versus other substrates.
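
As a worked illustration of the term s, the following sketch computes the negative logarithm of the accepted fraction of substrates. The text above does not state the base of the logarithm; base 10 is assumed here, and the function name and the example numbers are hypothetical.

```python
import math


def quantitative_specificity(accepted: int, possible: int) -> float:
    """Return s = -log10(accepted / possible).

    Assumption: a base-10 logarithm is used; the definition in the text
    only says "negative logarithm".
    """
    if possible <= 0 or not 0 < accepted <= possible:
        raise ValueError("need 0 < accepted <= possible")
    return -math.log10(accepted / possible)


# Hypothetical example: of all 20**6 hexapeptides, 64 are accepted.
all_hexapeptides = 20 ** 6
s_high = quantitative_specificity(64, all_hexapeptides)

# An enzyme that accepts every possible substrate has s = 0.
s_low = quantitative_specificity(all_hexapeptides, all_hexapeptides)
print(s_high, s_low)
```

Under this assumption a larger s corresponds to a smaller accepted fraction, i.e. a "high specificity", while an unspecific protease has s close to zero.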

[0039] The term "engineered" in combination with the term "enzyme" describes an enzyme that is comprised of different
components and that has features not being conferred by the individual components alone.

[0040] The term "protein scaffold" or "scaffold protein" refers to a variety of primary, secondary and tertiary polypeptide
structures.

[0041] The term "peptide sequence" indicates any peptide sequence used for insertion or substitution into or
combination with a protein scaffold. Peptide sequences are usually obtained by expression from DNA sequences which can
be synthesized according to well-established techniques or can be obtained from natural sources. Insertion, substitution
or combination of peptide sequences with the protein scaffold are generated by insertion, substitution or combination of
oligonucleotides into or with a polynucleotide encoding the protein scaffold. The term "synthetic" in combination with the
term "peptide sequence" refers to peptide sequences that are not present in the protein scaffold in which the peptide
sequences are inserted or substituted or with which they are combined.

[0042] The term "components" in combination with the term "engineered enzyme" refers to peptide or polypeptide
sequences that are combined in the engineering of such enzymes. Such components may among others comprise one
or more protein scaffolds and one or more synthetic peptide sequences. The term "library of engineered enzymes"
describes a mixture of engineered enzymes, whereby every single engineered enzyme is encoded by a different
polynucleotide sequence. The term "gene library" indicates a library of polynucleotides that encodes the library of engineered
enzymes. The term "SDR" or "Specificity determining region" refers to a synthetic peptide sequence that provides the
defined specificity when combined with the protein scaffold at sites that enable the resulting enzymes to discriminate
between the target substrate and one or more other substrates. Such sites are termed "SDR sites".

[0043] The terms "tertiary structure similar to the structure of" and "similar tertiary structure" in combination with the
terms "enzyme" or "protein" refer to proteins in which the type, sequence, connectivity and relative orientation of the
typical secondary structural elements of a protein, e.g. alpha-helices, beta-sheets, beta-turns and loops, are similar and
the proteins are therefore grouped into the same structural or topological class or fold. This includes proteins that have
altered, additional or deleted structural elements of any type but otherwise unchanged topology. Examples of such
structural classes are the TNF superfamily, the S1 fold or the S8 fold within the serine proteases, the GPCRs, or the
α/β-barrel fold.

[0044] The term "positions that correspond structurally" indicates amino acids in proteins of similar tertiary structure 
that correspond structurally to each other, i.e. they are usually located within the same structural or topological element 
of the structure. Within the structural element they possess the same relative positions with respect to beginning and 
end of the structural element. If, e.g. the topological comparison of two proteins reveals two structurally corresponding 
sequences of different length, then amino acids within, e.g. 20% and 40% of the respective region lengths, correspond 
to each other structurally. 
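
One possible reading of this definition is a mapping by relative position within two corresponding regions of different length. The sketch below follows that reading only; the function name, the linear interpolation and the rounding to the nearest residue are assumptions, and the second region in the example is hypothetical.

```python
def corresponding_position(pos: int, region_a: tuple, region_b: tuple) -> int:
    """Map a residue number in region_a to the residue in region_b that has
    the same relative position between the beginning and end of the region.

    Regions are (first_residue, last_residue) tuples, inclusive.
    Assumption: relative position is interpolated linearly and rounded.
    """
    start_a, end_a = region_a
    start_b, end_b = region_b
    if not start_a <= pos <= end_a:
        raise ValueError("pos must lie inside region_a")
    if end_a == start_a:
        return start_b  # a single-residue region maps to the region start
    fraction = (pos - start_a) / (end_a - start_a)
    return start_b + round(fraction * (end_b - start_b))


# Region 38-48 is one of the regions named for human trypsin I; the second
# region (100-105) is purely hypothetical.
print(corresponding_position(40, (38, 48), (100, 105)))  # 20% into the region
print(corresponding_position(42, (38, 48), (100, 105)))  # 40% into the region
```

In practice such a correspondence would be derived from a structural superposition or a sequence alignment; the interpolation only illustrates the "same relative position" idea.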

[0045] The term "library of engineered enzymes" refers to a multiplicity of enzymes or enzyme variants, which may 
exist as a mixture or in isolated form. 

[0046] Amino acid residues are abbreviated according to the following Table 1 either in one- or in three-letter code.



Table 1: Amino acid abbreviations

Abbreviations        Amino acid
A       Ala          Alanine
C       Cys          Cysteine
D       Asp          Aspartic acid
E       Glu          Glutamic acid
F       Phe          Phenylalanine
G       Gly          Glycine
H       His          Histidine
I       Ile          Isoleucine
K       Lys          Lysine
L       Leu          Leucine
M       Met          Methionine
N       Asn          Asparagine
P       Pro          Proline
Q       Gln          Glutamine
R       Arg          Arginine
S       Ser          Serine
T       Thr          Threonine
V       Val          Valine
W       Trp          Tryptophan
Y       Tyr          Tyrosine

Detailed description of the invention

[0047] The present disclosure provides engineered proteins with novel functions. In particular, the disclosure provides
enzymes with user-definable specificities. In a particular embodiment, the disclosure provides proteases with
user-definable specificities. Besides, the disclosure provides applications of such enzymes with novel, user-definable
specificities for therapeutic, research, diagnostic, nutritional, personal care or industrial purposes. Moreover, the disclosure
provides a method for generating enzymes with specificities that are not present in the components used for the
engineering of such enzymes. In particular, the disclosure is directed to the generation of enzymes that have sequences
that are essentially identical to mammalian, especially human, enzymes but have different specificities. Moreover, the
disclosure provides libraries of specific engineered enzymes with corresponding specificities encoded genetically, a
method for the generation of libraries of specific engineered enzymes with corresponding specificities encoded
genetically, and the application of such libraries for technical, diagnostic, nutritional, personal care or research purposes.

[0048] A first aspect discloses engineered enzymes with defined specificities. These engineered enzymes are
characterized by the following components:

(a) a protein scaffold capable of catalyzing at least one chemical reaction on a substrate, and

(b) one or more specificity determining regions (SDRs) located at sites in the protein scaffold that enable the resulting
engineered protein to discriminate between at least one target substrate and one or more different substrates,
wherein the SDRs are essentially synthetic peptide sequences.

[0049] Preferably, such defined specificity of the engineered enzymes is not conferred by the protein scaffold.

[0050] In principle, the protein scaffold can have a variety of primary, secondary and tertiary structures. The primary
structure, i.e. the amino acid sequence, can be an engineered sequence or can be derived from any viral, prokaryotic
or eukaryotic origin. For human therapeutic use, however, the protein scaffold is preferably of mammalian origin, and
more preferably, of human origin. Furthermore, the protein scaffold is capable of catalyzing one or more chemical reactions
and has preferably only a low specificity.

[0051] Preferably, derivatives of the protein scaffold are used that have modified amino acid sequences that confer
improved characteristics for the applicability as protein scaffolds. Such improved characteristics comprise, but are not
limited to, stability; expression or secretion yield; folding, in particular after combination of the protein scaffold with SDRs;
increased or decreased sensitivity to regulators such as activators or inhibitors; immunogenicity; catalytic rate; KM or
substrate affinity.

[0052] The engineered enzymes derive their quantitative specificity from the synthetic peptide sequences that are
combined with the protein scaffold. Therefore, the engineered peptide sequences are acting as Specificity Determining
Regions or SDRs. The number, the length and the positions of such SDRs can vary over a wide range. The number of
SDRs within the scaffold is at least one, preferably more than one, more preferably between two and eleven, most
preferably between two and six. The SDRs have a length between one and 50 amino acid residues, preferably a length
between one and 15 amino acid residues, more preferably a length between one and six amino acid residues. Alternatively,
the SDRs have a length between two and 20 amino acid residues, preferably a length between two and ten amino acid
residues, more preferably a length between three and eight amino acid residues.
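
To give a feeling for how the number and length of SDRs affect the size of a library, the following sketch counts the theoretical number of distinct peptide combinations. It assumes fully random SDRs drawn from the 20 amino acids of Table 1 and ignores codon bias, stop codons and practical library sizes; the function name is hypothetical.

```python
def theoretical_diversity(sdr_lengths, alphabet_size: int = 20) -> int:
    """Number of distinct SDR combinations for fully randomized SDRs.

    sdr_lengths: iterable with the length (in residues) of each SDR.
    Assumption: every position can hold any of alphabet_size residues.
    """
    total = 1
    for length in sdr_lengths:
        if length < 1:
            raise ValueError("each SDR must contain at least one residue")
        total *= alphabet_size ** length
    return total


# Two SDRs of six and five residues, the lengths preferred for SDR 1 and SDR 2.
print(theoretical_diversity([6, 5]))
# A single SDR of three residues.
print(theoretical_diversity([3]))
```

The count grows exponentially with the total number of randomized positions, which is one reason why partially random oligonucleotides may be of interest.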

[0053] The engineered enzymes can further be described as antibody-like protein molecules comprising constant and
variable regions, but having a non-immunoglobulin backbone and having an active site (catalytic activity) in the constant
region, whereby the substrate specificity of the active site is modulated by the variable region. Preferably, as in the
immunoglobulin structure, the variable regions are loops of variable length and composition that interact with a target
molecule.

[0054] In a particular variant, the engineered enzymes have hydrolase activity. In a preferred variant, the engineered enzymes
have proteolytic activity. Particularly preferred protein scaffolds for this variant are unspecific proteases or are parts of
unspecific proteases or are otherwise derived from unspecific proteases. The expressions "derived from" or "a derivative
thereof" in this respect and in the following variants and embodiments refer to derivatives of proteins that are mutated
at one or more amino acid positions and/or have a homology of at least 70%, preferably 90%, more preferably 95% and
most preferably 99% to the original protein, and/or that are proteolytically processed, and/or that have an altered
glycosylation pattern, and/or that are covalently linked to non-protein substances, and/or that are fused with further protein
domains, and/or that have C-terminal and/or N-terminal truncations, and/or that have specific insertions, substitutions
and/or deletions. Alternatively, "derived from" may refer to derivatives that are combinations or chimeras of two or more
fragments from two or more proteins, each of which optionally comprises any or all of the aforementioned modifications.
The tertiary structure of the protein scaffold can be of any type. Preferably, however, the tertiary structure belongs to
one of the following structural classes: class S1 (chymotrypsin fold of the serine proteases family), class S8 (subtilisin
fold of the serine proteases family), class SC (carboxypeptidase fold of the serine proteases family), class A1 (pepsin
A fold of the aspartic proteases), or class C14 (caspase-1 fold of the cysteine proteases). Examples of proteases that
can serve as the protein scaffold of engineered proteolytic enzymes for the use as human therapeutics are or are derived
from human trypsin, human thrombin, human chymotrypsin, human pepsin, human endothiapepsin, human caspases
1 to 14, and/or human furin.

[0055] The defined specificity of the engineered proteolytic enzymes is a measure of their ability to discriminate
between at least one target peptide or protein substrate and one or more further peptide or protein substrates. Preferably,
the defined specificity refers to the ability to discriminate peptide or protein substrates that differ in other positions than
the P1 site; more preferably, the defined specificity refers to the ability to discriminate peptide or protein substrates that
differ in other positions than the P1 site and the P1' site. Most preferably, the engineered proteolytic enzymes distinguish
target peptide or protein substrates at as many sites as is necessary to preferentially hydrolyse the target substrate versus
other proteins. As an example, a therapeutically useful engineered proteolytic enzyme applied intravenously in the human
body should be sufficiently specific to discriminate between the target substrate and any other protein in the human
serum. Preferably, such an engineered proteolytic enzyme recognizes and discriminates peptide substrates at three or
more amino acid positions, more preferably at four or more positions, and even more preferably at five or more amino
acid positions. These positions may either be adjacent or non-adjacent.

[0056] In a first embodiment, the protein scaffold has a tertiary structure or fold equal or similar to the tertiary structure
or fold of the S1 structural subclass of serine proteases, i.e. the chymotrypsin fold, and/or has at least 70% identity on
the amino acid level to a protein of the S1 structural subclass of serine proteases. It is preferred that SDRs are inserted
into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino
acid sequence homology to the regions 18-25, 38-48, 54-63, 73-86, 122-130, 148-156, 165-171 and 194-204 in human
trypsin I, and more preferably at one or more positions from the group of positions that correspond structurally or by
amino acid sequence homology to the regions 20-23, 41-45, 57-60, 76-83, 125-128, 150-153, 167-169 and 197-201
(numbering of amino acids according to SEQ ID NO:1). The number of SDRs to be combined with this type of protein
scaffold is preferably between 1 and 10, and more preferably between 2 and 4. Preferably, the protein scaffold is equal
to or is a derivative or homologue of one or more of the following proteins: chymotrypsin, granzyme, kallikrein, trypsin,
mesotrypsin, neutrophil elastase, pancreatic elastase, enteropeptidase, cathepsin, thrombin, ancrod, coagulation factor
IXa, coagulation factor VIIa, coagulation factor Xa, activated protein C, urokinase, tissue-type plasminogen activator,
plasmin, Desmodus-type plasminogen activator. More preferably, the protein scaffold is trypsin or thrombin or is a
derivative or homologue of trypsin or thrombin. For the use as a human therapeutic, the trypsin or thrombin scaffold
is most preferably of human origin in order to minimize the risk of an immune response or an allergenic reaction.

[0057] Preferably, derivatives with improved characteristics derived from human trypsin I or from proteins with similar
tertiary structure are used. Preferred examples of such derivatives are derived from human trypsin I (SEQ ID NO:1) and
comprise one or more of the following amino acid substitutions: E56G; R78W; Y131F; A146T; C183R.
It is preferred that at least one of two SDRs are inserted into human trypsin I, or a derivative thereof, between residues
42 and 43 (SDR 1) and between 123 and 124 (SDR 2), respectively (numbering of amino acids according to SEQ ID
NO:1). In addition the SDR 1 has a preferred length of 6 and the SDR 2 has a preferred length of 5 amino acids,
respectively. In a preferred variant of this embodiment, the SDR 1 and SDR 2 sequences comprise one of the amino
acid sequences listed in Table 2. Such engineered proteolytic enzymes have specificity for the target substrate B as
exemplified in Example IV.
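
The insertion of SDRs between two numbered scaffold residues can be sketched at the level of the amino acid sequence as follows. This is only an illustration of the numbering logic; the scaffold string and the SDR sequences are hypothetical placeholders and are neither SEQ ID NO:1 nor sequences from Table 2, and the function name is an assumption.

```python
def insert_sdrs(scaffold: str, insertions: dict) -> str:
    """Insert peptide sequences into a scaffold sequence.

    insertions maps a 1-based residue number N of the unmodified scaffold to
    the peptide that is inserted between residue N and residue N + 1.
    Insertions are applied from the highest position downwards so that the
    residue numbering of the original scaffold stays valid for every step.
    """
    result = scaffold
    for after_residue in sorted(insertions, reverse=True):
        if not 1 <= after_residue < len(scaffold):
            raise ValueError(f"position {after_residue} is outside the scaffold")
        sdr = insertions[after_residue]
        result = result[:after_residue] + sdr + result[after_residue:]
    return result


# Hypothetical 10-residue scaffold with two hypothetical SDRs.
engineered = insert_sdrs("ACDEFGHIKL", {2: "WW", 7: "PPP"})
print(engineered)
```

For a scaffold numbered according to SEQ ID NO:1, the corresponding call would use the keys 42 and 123 for SDR 1 and SDR 2, respectively. In the laboratory the insertion is made on the encoding polynucleotide rather than on the protein sequence.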

[0058] In a further embodiment the protein scaffold belongs to the S8 structural subclass of serine proteases and/or
has a tertiary structure similar to subtilisin E from Bacillus subtilis and/or has at least 70% identity on the amino acid
level to a protein of the S8 structural subclass of serine proteases. Preferably, the scaffold belongs to the subtilisin family
or the human pro-protein convertases. It is preferred that SDRs are inserted into the protein scaffold at one or more
positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions
6-17, 25-29, 47-55, 59-69, 101-111, 117-125, 129-137, 139-154, 158-169, 185-195 and 204-225 in subtilisin E from
Bacillus subtilis, and more preferably at one or more positions from the group of positions that correspond structurally
or by amino acid sequence homology to the regions 59-69, 101-111, 129-137, 158-169 and 204-225 (numbering of
amino acids according to SEQ ID NO:7). It is preferred that the protein scaffold is equal to or is a derivative or homologue
of one or more of the following proteins: subtilisin Carlsberg; B. subtilis subtilisin E; subtilisin BPN'; B. licheniformis
subtilisin; B. lentus subtilisin; Bacillus alcalophilus alkaline protease; proteinase K; kexin; human pro-protein convertase;
human furin. In a preferred variant, subtilisin BPN' or one of the proteins SPC 1 to 7 is used as the protein scaffold.

[0059] In a further embodiment the protein scaffold belongs to the family of aspartic proteases and/or has a tertiary
structure similar to human pepsin. Preferably, the scaffold belongs to the A1 class of proteases and/or has at least 70%
identity on the amino acid level to a protein of the A1 class of proteases. It is preferred that SDRs are inserted into the
protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid
sequence homology to the regions 6-18, 49-55, 74-83, 91-97, 112-120, 126-137, 159-164, 184-194, 242-247, 262-267
and 277-300 in human pepsin, and more preferably at one or more positions from the group of positions that correspond
structurally or by amino acid sequence homology to the regions 10-15, 75-80, 114-118, 130-134, 186-191 and 280-296
(numbering of amino acids according to SEQ ID NO:11). It is preferred that the protein scaffold is equal to or is a derivative
or homologue of one or more of the following proteins: pepsin, chymosin, renin, cathepsin, yapsin. Preferably, pepsin
or endothiapepsin or a derivative or homologue thereof is used as the protein scaffold.

[0060] In a further embodiment the protein scaffold belongs to the cysteine protease family and/or has a tertiary
structure similar to human caspase 7. Preferably the scaffold belongs to the C14 class of cysteine proteases or has at
least 70% identity on the amino acid level to a protein of the C14 class of cysteine proteases. It is preferred that SDRs
are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or
by amino acid sequence homology to the regions 78-91, 144-160, 186-198, 226-243 and 271-291 in human caspase
7, and more preferably at one or more positions from the group of positions that correspond structurally or by amino
acid sequence homology to the regions 80-86, 149-157, 190-194 and 233-238 (numbering of amino acids according to
SEQ ID NO:14). It is preferred that the protein scaffold is equal to or is a derivative or homologue of one of the caspases
1 to 9.

[0061] In a further embodiment the protein scaffold belongs to the S11 class of serine proteases or has at least 70%
identity on the amino acid level to a protein of the S11 class of serine proteases and/or has a tertiary structure similar
to D-alanyl-D-alanine transpeptidase from Streptomyces species K15. It is preferred that SDRs are inserted into the
protein scaffold at one or more positions from the group of positions that correspond structurally or by amino acid
sequence homology to the regions 67-79, 137-150, 191-206, 212-222 and 241-251 in D-alanyl-D-alanine transpeptidase
from Streptomyces species K15, and more preferably at one or more positions from the group of positions that correspond
structurally or by amino acid sequence homology to the regions 70-75, 141-147, 195-202 and 216-220 (numbering of
amino acids according to SEQ ID NO:15). It is preferred that the D-alanyl-D-alanine transpeptidase from Streptomyces
species K15 or a derivative or homologue thereof is used as the scaffold.

[0062] In a further embodiment the protein scaffold belongs to the S21 class of serine proteases or has at least 70%
identity on the amino acid level to a protein of the S21 class of serine proteases and/or has a tertiary structure similar
to assemblin from human cytomegalovirus. It is preferred that SDRs are inserted into the protein scaffold at one or more
positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions
25-33, 64-69, 134-155, 162-169 and 217-244 in assemblin from human cytomegalovirus, and more preferably at one
or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the
regions 27-31, 164-168 and 222-239 (numbering of amino acids according to SEQ ID NO:16). It is preferred that the
assemblin from human cytomegalovirus or a derivative or homologue thereof is used as the scaffold.

[0063] In a further embodiment the protein scaffold belongs to the S26 class of serine proteases or has at least 70%
identity on the amino acid level to a protein of the S26 class of serine proteases and/or has a tertiary structure similar
to the signal peptidase from Escherichia coli. It is preferred that SDRs are inserted into the protein scaffold at one or
more positions from the group of positions that correspond structurally or by amino acid sequence homology to the
regions 8-14, 57-68, 125-134, 239-254, 200-211 and 228-239 in signal peptidase from Escherichia coli, and more
preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence
homology to the regions 9-13, 60-67, 127-132 and 203-209 (numbering of amino acids according to SEQ ID NO:17). It
is preferred that the signal peptidase from Escherichia coli or a derivative or homologue thereof is used as the scaffold.

[0064] In a further embodiment the protein scaffold belongs to the S33 class of serine proteases or has at least 70%
identity on the amino acid level to a protein of the S33 class of serine proteases and/or has a tertiary structure similar
to the prolyl aminopeptidase from Serratia marcescens. It is preferred that SDRs are inserted into the protein scaffold
at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology
to the regions 47-54, 152-160, 203-212 and 297-302 in prolyl aminopeptidase from Serratia marcescens, and more
preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence
homology to the regions 50-53, 154-158 and 206-210 (numbering of amino acids according to SEQ ID NO:18). It is
preferred that the prolyl aminopeptidase from Serratia marcescens or a derivative or homologue thereof is used as the
scaffold.

[0065] In a further embodiment the protein scaffold belongs to the S51 class of serine proteases or has at least 70%
identity on the amino acid level to a protein of the S51 class of serine proteases and/or has a tertiary structure similar
to aspartyl dipeptidase from Escherichia coli. It is preferred that SDRs are inserted into the protein scaffold at one or
more positions from the group of positions that correspond structurally or by amino acid sequence homology to the
regions 8-16, 38-46, 85-92, 132-140, 159-170 and 205-211 in aspartyl dipeptidase from Escherichia coli, and more
preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence
homology to the regions 10-14, 87-90, 134-138 and 160-165 (numbering of amino acids according to SEQ ID NO:19).
It is preferred that the aspartyl dipeptidase from Escherichia coli or a derivative or homologue thereof is used as the
scaffold.

[0066] In a further embodiment the protein scaffold belongs to the A2 class of aspartic proteases or has at least 70%
identity on the amino acid level to a protein of the A2 class of aspartic proteases and/or has a tertiary structure similar
to the protease from human immunodeficiency virus. It is preferred that SDRs are inserted into the protein scaffold at
one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to
the regions 5-12, 17-23, 27-30, 33-38 and 77-83 in protease from human immunodeficiency virus, and more preferably
at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology
to the regions 7-10, 18-21, 34-37 and 79-82 (numbering of amino acids according to SEQ ID NO:20). It is preferred that
the protease from human immunodeficiency virus, preferably HIV-1 protease, or a derivative or homologue thereof is
used as the scaffold.

[0067] In a further embodiment the protein scaffold belongs to the A26 class of aspartic proteases or has at least
70% identity on the amino acid level to a protein of the A26 class of aspartic proteases and/or has a tertiary structure
similar to the omptin from Escherichia coli. It is preferred that SDRs are inserted into the protein scaffold at one or more
positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions
28-40, 86-98, 150-168, 213-219 and 267-278 in omptin from Escherichia coli, and more preferably at one or more
positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions
33-38, 161-168 and 273-277 (numbering of amino acids according to SEQ ID NO:21). It is preferred that the omptin
from Escherichia coli or a derivative or homologue thereof is used as the scaffold.

[0068] In a further embodiment the protein scaffold belongs to the C1 class of cysteine proteases or has at least 70%
identity on the amino acid level to a protein of the C1 class of cysteine proteases and/or has a tertiary structure similar
to the papain from Carica papaya. It is preferred that SDRs are inserted into the protein scaffold at one or more positions
from the group of positions that correspond structurally or by amino acid sequence homology to the regions 17-24, 61-68,
88-95, 135-142, 153-158 and 176-184 in papain from Carica papaya, and more preferably at one or more positions from
the group of positions that correspond structurally or by amino acid sequence homology to the regions 63-66, 136-139
and 177-181 (numbering of amino acids according to SEQ ID NO:22). It is preferred that the papain from Carica papaya
or a derivative or homologue thereof is used as the scaffold.

[0069] In a further embodiment the protein scaffold belongs to the C2 class of cysteine proteases or has at least 70%
identity on the amino acid level to a protein of the C2 class of cysteine proteases and/or has a tertiary structure similar
to human calpain-2. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the
group of positions that correspond structurally or by amino acid sequence homology to the regions 90-103, 160-172,
193-199, 243-260, 286-294 and 316-322 in human calpain-2, and more preferably at one or more positions from the
group of positions that correspond structurally or by amino acid sequence homology to the regions 92-101, 245-250 and
287-291 (numbering of amino acids according to SEQ ID NO:23). It is preferred that the human calpain-2 or a derivative
or homologue thereof is used as the scaffold.

[0070] In a further embodiment the protein scaffold belongs to the C4 class of cysteine proteases or has at least 70%
identity on the amino acid level to a protein of the C4 class of cysteine proteases and/or has a tertiary structure similar
to NIa protease from tobacco etch virus. It is preferred that SDRs are inserted into the protein scaffold at one or more
positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions
23-31, 112-120, 144-150, 168-176 and 205-218 in NIa protease from tobacco etch virus, and more preferably at one or
more positions from the group of positions that correspond structurally or by amino acid sequence homology to the
regions 145-149, 169-174 and 212-218 (numbering of amino acids according to SEQ ID NO:24). It is preferred that the
NIa protease from tobacco etch virus (TEV protease) or a derivative or homologue thereof is used as the scaffold.

[0071] In a further embodiment the protein scaffold belongs to the C10 class of cysteine proteases or has at least
70% identity on the amino acid level to a protein of the C10 class of cysteine proteases and/or has a tertiary structure
similar to the streptopain from Streptococcus pyogenes. It is preferred that SDRs are inserted into the protein scaffold
at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology
to the regions 81-90, 133-140, 150-164, 191-199, 219-229, 246-256, 306-312 and 330-337 in streptopain from
Streptococcus pyogenes, and more preferably at one or more positions from the group of positions that correspond structurally
or by amino acid sequence homology to the regions 82-87, 134-138, 250-254 and 331-335 (numbering of amino acids
according to SEQ ID NO:25). It is preferred that the streptopain from Streptococcus pyogenes or a derivative or homologue
thereof is used as the scaffold.

[0072] In a further embodiment the protein scaffold belongs to the C19 class of cysteine proteases or has at least
70% identity on the amino acid level to a protein of the C19 class of cysteine proteases and/or has a tertiary structure
similar to human ubiquitin specific protease 7. It is preferred that SDRs are inserted into the protein scaffold at one or
more positions from the group of positions that correspond structurally or by amino acid sequence homology to the
regions 3-15, 63-70, 80-86, 248-256, 272-283 and 292-304 in human ubiquitin specific protease 7, and more preferably
at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology
to the regions 10-15, 251-255, 277-281 and 298-304 (numbering of amino acids according to SEQ ID NO:26). It is
preferred that the human ubiquitin specific protease 7 or a derivative or homologue thereof is used as the scaffold.

[0073] In a further embodiment the protein scaffold belongs to the C47 class of cysteine proteases or has at least
70% identity on the amino acid level to a protein of the C47 class of cysteine proteases and/or has a tertiary structure
similar to the staphopain from Staphylococcus aureus. It is preferred that SDRs are inserted into the protein scaffold at
one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to
the regions 15-23, 57-66, 108-119, 142-149 and 157-164 in staphopain from Staphylococcus aureus, and more preferably
at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology
to the regions 17-22, 111-117, 143-147 and 159-163 (numbering of amino acids according to SEQ ID NO:27). It is
preferred that the staphopain from Staphylococcus aureus or a derivative or homologue thereof is used as the scaffold.

[0074] In a further embodiment the protein scaffold belongs to the C48 class of cysteine proteases or has at least
70% identity on the amino acid level to a protein of the C48 class of cysteine proteases and/or has a tertiary structure
similar to the Ulp1 endopeptidase from Saccharomyces cerevisiae. It is preferred that SDRs are inserted into the protein
scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence
homology to the regions 40-51, 108-115, 132-141, 173-179 and 597-605 in Ulp1 endopeptidase from Saccharomyces
cerevisiae, and more preferably at one or more positions from the group of positions that correspond structurally or by
amino acid sequence homology to the regions 43-49, 110-113, 133-137 and 175-178 (numbering of amino acids according
to SEQ ID NO:28). It is preferred that the Ulp1 endopeptidase from Saccharomyces cerevisiae or a derivative or
homologue thereof is used as the scaffold.

[0075] In a further embodiment the protein scaffold belongs to the C56 class of cysteine proteases or has at least
70% identity on the amino acid level to a protein of the C56 class of cysteine proteases and/or has a tertiary structure
similar to the PfpI endopeptidase from Pyrococcus horikoshii. It is preferred that SDRs are inserted into the protein
scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence
homology to the regions 8-16, 40-47, 66-73, 118-125 and 147-153 in PfpI endopeptidase from Pyrococcus horikoshii,
and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid
sequence homology to the regions 9-14, 68-71, 120-123 and 148-151 (numbering of amino acids according to SEQ ID
NO:29). It is preferred that the PfpI endopeptidase from Pyrococcus horikoshii or a derivative or homologue thereof is
used as the scaffold.

[0076] In a further embodiment the protein scaffold belongs to the M4 class of metallo proteases or has at least 70%
identity on the amino acid level to a protein of the M4 class of metallo proteases and/or has a tertiary structure similar
to thermolysin from Bacillus thermoproteolyticus. It is preferred that SDRs are inserted into the protein scaffold at one
or more positions from the group of positions that correspond structurally or by amino acid sequence homology to the
regions 106-118, 125-130, 152-160, 197-204, 210-213 and 221-229 in thermolysin from Bacillus thermoproteolyticus,
and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid
sequence homology to the regions 108-115, 126-129, 199-203 and 223-227 (numbering of amino acids according to
SEQ ID NO:30). It is preferred that the thermolysin from Bacillus thermoproteolyticus or a derivative or homologue thereof
is used as the scaffold.

[0077] In a further embodiment the protein scaffold belongs to the M10 class of metallo proteases or has at least 70%
identity on the amino acid level to a protein of the M10 class of metallo proteases and/or has a tertiary structure similar
to human collagenase. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the
group of positions that correspond structurally or by amino acid sequence homology to the regions 2-7, 68-79, 85-90,
107-111 and 135-141 in human collagenase, and more preferably at one or more positions from the group of positions
that correspond structurally or by amino acid sequence homology to the regions 3-6, 71-78 and 136-140 (numbering of
amino acids according to SEQ ID NO:31). It is preferred that human collagenase or a derivative or homologue thereof
is used as the scaffold.

[0078] It is further preferred that the engineered enzymes have glycosidase activity. A particularly suited protein
scaffold for this variant is a glycosylase or is derived from a glycosylase. Preferably, the tertiary structure belongs to one
of the following structural classes: class GH13, GH7, GH12, GH11, GH10, GH28, GH26, and GH18 (beta/alpha)8 barrel.

[0079] In a first embodiment the protein scaffold belongs to the GH13 class of glycosylases or has at least 70% identity
on the amino acid level to a protein of the GH13 class of glycosylases and/or has a tertiary structure similar to human
pancreatic alpha-amylase. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from
the group of positions that correspond structurally or by amino acid sequence homology to the regions 50-60, 100-110,
148-167, 235-244, 302-310 and 346-359 in human pancreatic alpha-amylase, and more preferably at one or more
positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions
51-58, 148-155 and 303-309 (numbering of amino acids according to SEQ ID NO:32). It is preferred that human pancreatic
alpha-amylase or a derivative or homologue thereof is used as the scaffold.

[0080] In a further embodiment the protein scaffold belongs to the GH7 class of glycosylases or has at least 70%
identity on the amino acid level to a protein of the GH7 class of glycosylases and/or has a tertiary structure similar to
cellulase from Trichoderma reesei. It is preferred that SDRs are inserted into the protein scaffold at one or more positions
from the group of positions that correspond structurally or by amino acid sequence homology to the regions 47-56,
93-104, 173-182, 215-223, 229-236 and 322-334 in cellulase from Trichoderma reesei, and more preferably at one or
more positions from the group of positions that correspond structurally or by amino acid sequence homology to the
regions 175-180, 218-222 and 324-332 (numbering of amino acids according to SEQ ID NO:33). It is preferred that
cellulase from Trichoderma reesei or a derivative or homologue thereof is used as the scaffold.

[0081] In a further embodiment the protein scaffold belongs to the GH12 class of glycosylases or has at least 70%
identity on the amino acid level to a protein of the GH12 class of glycosylases and/or has a tertiary structure similar to
cellulase from Aspergillus niger. It is preferred that SDRs are inserted into the protein scaffold at one or more positions
from the group of positions that correspond structurally or by amino acid sequence homology to the regions 18-28, 55-60,
106-113, 126-132 and 149-159 in cellulase from Aspergillus niger, and more preferably at one or more positions from
the group of positions that correspond structurally or by amino acid sequence homology to the regions 20-26, 56-59,
108-112 and 151-156 (numbering of amino acids according to SEQ ID NO:34). It is preferred that cellulase from Aspergillus
niger or a derivative or homologue thereof is used as the scaffold.

[0082] In a further embodiment the protein scaffold belongs to the GH11 class of glycosylases or has at least 70%
identity on the amino acid level to a protein of the GH11 class of glycosylases and/or has a tertiary structure similar to
xylanase from Aspergillus niger. It is preferred that SDRs are inserted into the protein scaffold at one or more positions
from the group of positions that correspond structurally or by amino acid sequence homology to the regions 7-14, 33-39,
88-97, 114-126 and 158-167 in xylanase from Aspergillus niger, and more preferably at one or more positions from the
group of positions that correspond structurally or by amino acid sequence homology to the regions 20-26, 56-59, 108-112
and 151-156 (numbering of amino acids according to SEQ ID NO:35). It is preferred that xylanase from Aspergillus niger
or a derivative or homologue thereof is used as the scaffold.

[0083] In a further embodiment the protein scaffold belongs to the GH10 class of glycosylases or has at least 70%
identity on the amino acid level to a protein of the GH10 class of glycosylases and/or has a tertiary structure similar to
xylanase from Streptomyces lividans. It is preferred that SDRs are inserted into the protein scaffold at one or more
positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions
21-29, 42-50, 84-92, 130-136, 206-217 and 269-278 in xylanase from Streptomyces lividans, and more preferably at
one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to
the regions 43-49, 86-90, 208-213 and 271-276 (numbering of amino acids according to SEQ ID NO:36). It is preferred
that xylanase from Streptomyces lividans or a derivative or homologue thereof is used as the scaffold.

[0084] In a further embodiment the protein scaffold belongs to the GH28 class of glycosylases or has at least 70%
identity on the amino acid level to a protein of the GH28 class of glycosylases and/or has a tertiary structure similar to
pectinase from Aspergillus niger. It is preferred that SDRs are inserted into the protein scaffold at one or more positions
from the group of positions that correspond structurally or by amino acid sequence homology to the regions 82-88,
118-126, 171-178, 228-236, 256-264 and 289-299 in pectinase from Aspergillus niger, and more preferably at one or
more positions from the group of positions that correspond structurally or by amino acid sequence homology to the
regions 116-124, 174-178 and 291-296 (numbering of amino acids according to SEQ ID NO:37). It is preferred that
pectinase from Aspergillus niger or a derivative or homologue thereof is used as the scaffold.

[0085] In a further embodiment the protein scaffold belongs to the GH26 class of glycosylases or has at least 70%
identity on the amino acid level to a protein of the GH26 class of glycosylases and/or has a tertiary structure similar to
mannanase from Pseudomonas cellulosa. It is preferred that SDRs are inserted into the protein scaffold at one or more
positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions
75-83, 113-125, 174-182, 217-224, 247-254, 324-332 and 325-340 in mannanase from Pseudomonas cellulosa, and
more preferably at one or more positions from the group of positions that correspond structurally or by amino acid
sequence homology to the regions 115-123, 176-180, 286-291 and 328-337 (numbering of amino acids according to
SEQ ID NO:38). It is preferred that mannanase from Pseudomonas cellulosa or a derivative or homologue thereof is
used as the scaffold.

[0086] In a further embodiment the protein scaffold belongs to the GH18 (beta/alpha)8 barrel class of glycosylases
or has at least 70% identity on the amino acid level to a protein of the GH18 class of glycosylases and/or has a tertiary
structure similar to chitinase from Bacillus circulans. It is preferred that SDRs are inserted into the protein scaffold at
one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to
the regions 21-29, 57-65, 130-136, 176-183, 221-229, 249-257 and 327-337 in chitinase from Bacillus circulans, and
more preferably at one or more positions from the group of positions that correspond structurally or by amino acid
sequence homology to the regions 59-63, 178-181, 250-254 and 330-336 (numbering of amino acids according to SEQ
ID NO:39). It is preferred that chitinase from Bacillus circulans or a derivative or homologue thereof is used as the scaffold.

[0087] It is further preferred that the engineered enzymes have esterhydrolase activity. Preferably, the protein scaffold
for this variant has lipase, phosphatase, phytase, or phosphodiesterase activity.

[0088] In a first embodiment the protein scaffold belongs to the GX class of esterases or has at least 70% identity on
the amino acid level to a protein of the GX class of esterases and/or has a tertiary structure similar to the structure of
the lipase B from Candida antarctica. Preferably, the scaffold has lipase activity. It is preferred that SDRs are inserted
into the protein scaffold at one or more positions from the group of positions that correspond structurally or by amino
acid sequence homology to the regions 139-148, 188-195, 216-224, 256-266, 272-287 in lipase B from Candida
antarctica, and more preferably at one or more positions from the group of positions that correspond structurally or by amino
acid sequence homology to the regions 141-146, 218-222, 259-263 and 275-283 (numbering of amino acids according
to SEQ ID NO:40). It is preferred that lipase B from Candida antarctica or a derivative or homologue thereof is used as
the scaffold.

[0089] In a further embodiment the protein scaffold belongs to the GX class of esterases or has at least 70% identity
on the amino acid level to a protein of the GX class of esterases and/or has a tertiary structure similar to the pancreatic
lipase from guinea pig. Preferably, the scaffold has lipase activity. It is preferred that SDRs are inserted into the protein
scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence
homology to the regions 78-90, 91-100, 112-120, 179-186, 207-218, 238-247 and 248-260 in pancreatic lipase from
guinea pig, and more preferably at one or more positions from the group of positions that correspond structurally or by
amino acid sequence homology to the regions 80-87, 114-118, 209-215 and 239-246 (numbering of amino acids according
to SEQ ID NO:41). It is preferred that pancreatic lipase from guinea pig or a derivative or homologue thereof is used as
the scaffold.

[0090] In a further embodiment the protein scaffold has a tertiary structure similar to the structure of the alkaline
phosphatase from Escherichia coli or has at least 70% identity on the amino acid level to a protein that has a tertiary
structure similar to the structure of the alkaline phosphatase from Escherichia coli. Preferably, the scaffold has
phosphatase activity. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group
of positions that correspond structurally or by amino acid sequence homology to the regions 110-122, 187-142, 170-175,
186-193, 280-287 and 425-435 in alkaline phosphatase from Escherichia coli, and more preferably at one or more
positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions
171-174, 187-191, 282-286 and 426-433 (numbering of amino acids according to SEQ ID NO:42). It is preferred that
alkaline phosphatase from Escherichia coli or a derivative or homologue thereof is used as the scaffold.

[0091] In a further embodiment the protein scaffold has a tertiary structure similar to the structure of the bovine
pancreatic desoxyribonuclease I or has at least 70% identity on the amino acid level to a protein that has a tertiary
structure similar to the structure of the bovine pancreatic desoxyribonuclease I. Preferably, the scaffold has
phosphodiesterase activity. More preferably, a nuclease, and most preferably, an unspecific endonuclease or a derivative thereof
is used as the scaffold. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the
group of positions that correspond structurally or by amino acid sequence homology to the regions 14-21, 41-47, 72-77,
97-111, 135-143, 171-178, 202-209 and 242-251 in bovine pancreatic desoxyribonuclease I, and more preferably at
one or more positions from the group of positions that correspond structurally or by amino acid sequence homology to
the regions 16-19, 42-46, 136-141 and 172-176 (numbering of amino acids according to SEQ ID NO:43). It is preferred
that bovine pancreatic desoxyribonuclease I or human desoxyribonuclease I or a derivative or homologue thereof is
used as the scaffold.

[0092] It is further preferred that the engineered enzyme has transferase activity. A particularly suited protein scaffold
for this variant is a glycosyl-, a phospho- or a methyltransferase, or is a derivative thereof. Particularly preferred protein
scaffolds for this variant are glycosyltransferases or are derived from glycosyltransferases. The tertiary structure of the
protein scaffold can be of any type. Preferably, however, the tertiary structure belongs to one of the following structural
classes: GH13 and GT1.

[0093] In a first embodiment the protein scaffold belongs to the GH13 class of transferases or has at least 70% identity
on the amino acid level to a protein of the GH13 class of transferases and/or has a tertiary structure similar to the structure
of the cyclomaltodextrin glucanotransferase from Bacillus circulans. Preferably, the scaffold has transferase activity,
and more preferably a glycosyltransferase is used as the scaffold. It is preferred that SDRs are inserted into the protein
scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence
homology to the regions 38-48, 85-94, 142-154, 178-186, 259-266, 331-340 and 367-377 in cyclomaltodextrin
glucanotransferase from Bacillus circulans, and more preferably at one or more positions from the group of positions that
correspond structurally or by amino acid sequence homology to the regions 87-92, 180-185, 261-264 and 269-275
(numbering of amino acids according to SEQ ID NO:44). It is preferred that cyclomaltodextrin glucanotransferase from
Bacillus circulans or a derivative or homologue thereof is used as the scaffold.

[0094] In a further embodiment the protein scaffold belongs to the GT1 class of transferases or has at least 70% identity
on the amino acid level to a protein of the GT1 class of transferases and/or has a tertiary structure similar to the structure
of the glycosyltransferase from Amycolatopsis orientalis A82846. Preferably the scaffold has transferase activity, and
more preferably glycosyltransferase activity. It is preferred that SDRs are inserted into the protein scaffold at one or
more positions from the group of positions that correspond structurally or by amino acid sequence homology to the
regions 58-74, 130-138, 185-193, 228-236 and 314-323 in glycosyltransferase from Amycolatopsis orientalis A82846,
and more preferably at one or more positions from the group of positions that correspond structurally or by amino acid
sequence homology to the regions 61-71, 230-234 and 316-321 (numbering of amino acids according to SEQ ID NO:45).
It is preferred that the glycosyltransferase from Amycolatopsis orientalis A82846 or a derivative or homologue thereof
is used as the scaffold.

[0095] It is further preferred that the engineered enzymes have oxidoreductase activity. A particularly suited protein
scaffold for this variant is a monooxygenase, a dioxygenase or an alcohol dehydrogenase, or a derivative thereof. The
tertiary structure of the protein scaffold can be of any type.

[0096] In a first embodiment the protein scaffold has a tertiary structure similar to the structure of the
2,3-dihydroxybiphenyl dioxygenase from Pseudomonas sp. or has at least 70% identity on the amino acid level to a protein that has
a tertiary structure similar to the structure of the 2,3-dihydroxybiphenyl dioxygenase from Pseudomonas sp. Preferably,
the scaffold has dioxygenase activity. It is preferred that SDRs are inserted into the protein scaffold at one or more
positions from the group of positions that correspond structurally or by amino acid sequence homology to the regions
172-185, 198-206, 231-237, 250-259 and 282-287 in 2,3-dihydroxybiphenyl dioxygenase from Pseudomonas sp., and
more preferably at one or more positions from the group of positions that correspond structurally or by amino acid
sequence homology to the regions 175-182, 200-204, 252-257 and 284-287 (numbering of amino acids according to
SEQ ID NO:46). It is preferred that the 2,3-dihydroxybiphenyl dioxygenase from Pseudomonas sp. or a derivative or
homologue thereof is used as the scaffold.

[0097] In a further embodiment the protein scaffold has a tertiary structure similar to the structure of the catechol
dioxygenase from Acinetobacter sp. or has at least 70% identity on the amino acid level to a protein that has a tertiary
structure similar to the structure of the catechol dioxygenase from Acinetobacter sp. Preferably, the scaffold has
dioxygenase activity, and more preferably catechol dioxygenase activity. It is preferred that SDRs are inserted into the protein
scaffold at one or more positions from the group of positions that correspond structurally or by amino acid sequence
homology to the regions 66-72, 105-112, 156-171 and 198-207 in catechol dioxygenase from Acinetobacter sp., and
more preferably at one or more positions from the group of positions that correspond structurally or by amino acid
sequence homology to the regions 107-110, 161-171 and 201-205 (numbering of amino acids according to SEQ ID NO:47).
It is preferred that the catechol dioxygenase from Acinetobacter sp. or a derivative or homologue thereof is used as
the scaffold.

[0098] In a further embodiment the protein scaffold has a tertiary structure similar to the structure of the
camphor-5-monooxygenase from Pseudomonas putida or has at least 70% identity on the amino acid level to a protein that has a
tertiary structure similar to the structure of the camphor-5-monooxygenase from Pseudomonas putida. Preferably, the
scaffold has monooxygenase activity, and more preferably camphor monooxygenase activity. It is preferred that SDRs
are inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or
by amino acid sequence homology to the regions 26-31, 57-63, 84-98, 182-191, 242-256, 292-299 and 392-399 in
camphor-5-monooxygenase from Pseudomonas putida, and more preferably at one or more positions from the group
of positions that correspond structurally or by amino acid sequence homology to the regions 85-96, 183-188, 244-253,
293-298 and 393-398 (numbering of amino acids according to SEQ ID NO:48). It is preferred that the
camphor-5-monooxygenase from Pseudomonas putida or a derivative or homologue thereof is used as the scaffold.

[0099] In a further embodiment the protein scaffold has a tertiary structure similar to the structure of the alcohol
dehydrogenase from Equus caballus or has at least 70% identity on the amino acid level to a protein that has a tertiary
structure similar to the structure of the alcohol dehydrogenase from Equus caballus. Preferably, the scaffold has alcohol
dehydrogenase activity. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the
group of positions that correspond structurally or by amino acid sequence homology to the regions 49-63, 111-112,
294-301 and 361-369 in alcohol dehydrogenase from Equus caballus, and more preferably at one or more positions
from the group of positions that correspond structurally or by amino acid sequence homology to the regions 51-61 and
295-299 (numbering of amino acids according to SEQ ID NO:49). It is preferred that the alcohol dehydrogenase from
Equus caballus or a derivative or homologue thereof is used as the scaffold.

[0100] It is further preferred that the engineered enzymes have lyase activity. A particularly suited protein scaffold for
this variant is an oxoacid lyase or is a derivative thereof. Particularly preferred protein scaffolds for this variant are aldolases
or synthases, or are derived therefrom. The tertiary structure of the protein scaffold can be of any type, but a
(beta/alpha)8 barrel structure is preferred.

[0101] In a first embodiment the protein scaffold has a tertiary structure similar to the structure of the
N-acetyl-d-neuramic acid aldolase from Escherichia coli or has at least 70% identity on the amino acid level to a protein that has
a tertiary structure similar to the structure of the N-acetyl-d-neuramic acid aldolase from Escherichia coli. Preferably,
the scaffold has aldolase activity. It is preferred that SDRs are inserted into the protein scaffold at one or more positions
from the group of positions that correspond structurally or by amino acid sequence homology to the regions 45-55, 78-87,
105-113, 137-146, 164-171, 187-193, 205-210, 244-255 and 269-276 in N-acetyl-d-neuramic acid aldolase from
Escherichia coli, and more preferably at one or more positions from the group of positions that correspond structurally or
by amino acid sequence homology to the regions 45-52, 138-144, 189-192, 247-253 and 271-275 (numbering of amino
acids according to SEQ ID NO:50). It is preferred that the N-acetyl-d-neuramic acid aldolase from Escherichia coli or a
derivative or homologue thereof is used as the scaffold.

[0102] In a further embodiment the protein scaffold has a tertiary structure similar to the structure of the tryptophan
synthase from Salmonella typhimurium or has at least 70% identity on the amino acid level to a protein that has a tertiary
structure similar to the structure of the tryptophan synthase from Salmonella typhimurium. Preferably, the scaffold has
synthase activity. It is preferred that SDRs are inserted into the protein scaffold at one or more positions from the group
of positions that correspond structurally or by amino acid sequence homology to the regions 56-63, 127-134, 154-161,
175-193, 209-216 and 230-240 in tryptophan synthase from Salmonella typhimurium, and more preferably at one or
more positions from the group of positions that correspond structurally or by amino acid sequence homology to the
regions 57-62, 155-160, 178-190 and 210-215 (numbering of amino acids according to SEQ ID NO:51). It is preferred
that the tryptophan synthase from Salmonella typhimurium or a derivative or homologue thereof is used as the scaffold.
[0103] It is further preferred that the engineered enzymes have isomerase activity. A particularly suited protein scaffold
for this variant is an isomerase that interconverts aldoses and ketoses, or is a derivative thereof.

[0104] In a first embodiment, the protein scaffold has a tertiary structure similar to the structure of the xylose isomerase
from Actinoplanes missouriensis or has at least 70% identity on the amino acid level to a protein that has a tertiary
structure similar to the structure of the xylose isomerase from Actinoplanes missouriensis. It is preferred that SDRs are
inserted into the protein scaffold at one or more positions from the group of positions that correspond structurally or by
amino acid sequence homology to the regions 18-31, 92-103, 136-147, 178-188 and 250-257 in xylose isomerase from
Actinoplanes missouriensis, and more preferably at one or more positions from the group of positions that correspond
structurally or by amino acid sequence homology to the regions 20-27, 92-99 and 180-186 (numbering of amino acids
according to SEQ ID NO:52). It is preferred that the xylose isomerase from Actinoplanes missouriensis or a derivative
or homologue thereof is used as the scaffold.

[0105] It is further preferred that the engineered enzymes have ligase activity. A particularly suited protein scaffold for
this variant is a DNA ligase, or is a derivative thereof.

[0106] In a first embodiment, the protein scaffold has a tertiary structure similar to the structure of the DNA ligase from
Bacteriophage T7 or has at least 70% identity on the amino acid level to a protein that has a tertiary structure similar to
the structure of the DNA ligase from Bacteriophage T7. It is preferred that SDRs are inserted into the protein scaffold
at one or more positions from the group of positions that correspond structurally or by amino acid sequence homology
to the regions 52-60, 94-108, 119-131, 241-248, 255-263 and 302-318 in DNA ligase from Bacteriophage T7, and more
preferably at one or more positions from the group of positions that correspond structurally or by amino acid sequence
homology to the regions 96-106, 121-129, 256-262 and 304-316 (numbering of amino acids according to SEQ ID NO:53).
It is preferred that the DNA ligase from Bacteriophage T7 or a derivative or homologue thereof is used as the scaffold.
[0107] A second aspect is directed to the application of engineered enzymes with specificities for therapeutic, research,
diagnostic, nutritional, personal care or industrial purposes. The application comprises at least the following steps:

(a) identification of a target peptide substrate whose hydrolysis has a positive effect in connection with the intended
purpose, such as curing a disease, diagnosing a disease, processing of ingredients for human or animal nutrition,
or other technical processes;

(b) provision of an engineered enzyme, the enzyme being specific for the target peptide identified in step (a); and

(c) use of the enzyme as provided in step (b) for the intended purpose.

[0108] In a first variant of this aspect, the engineered enzyme is used as a therapeutic means to inactivate a disease- 
related target substrate. This application comprises at least the following steps: 

(a) identification of a target substrate whose function is connected to a disease and whose inactivation has a positive
effect in connection with the disease, and determination of a target site within the target substrate characterized by
the fact that modification at the target site leads to the inactivation of the target substrate;

(b) provision of an engineered enzyme, the enzyme being specific for the target site identified in step (a); and

(c) use of the enzyme for the inactivation of the target substrate inside or outside the human body.

[0109] In a preferred embodiment the scaffold of the engineered enzyme provided in step (c) is of human origin in
order to avoid or reduce immunogenicity or allergenic effects associated with the application of the enzyme in the human
body. In a more preferred embodiment of this variant, the scaffold is a human protease and the modification is hydrolysis
of a target site in a protein target. Preferably, the hydrolysis leads to the activation or inactivation of the peptide or protein
target. Potential peptide or protein targets include: cytokines, growth factors, peptide hormones, interleukins, interferons,
enzymes from the coagulation cascade, serpins, immunoglobulins, soluble or membrane-bound receptors, cellular or
viral surface proteins, peptide drugs and protein drugs.

[0110] A particularly preferred embodiment is based on the finding that the engineered enzyme is capable of cleaving
human tumor necrosis factor-alpha (TNF-α). The engineered enzymes or the fusion protein can thus be
used for preparing medicaments for the treatment of inflammatory diseases (as well as other diseases connected with
TNF-α). Preferably, said engineered enzyme or said fusion protein is capable of specifically inactivating human tumor
necrosis factor-alpha (hTNF-α), more preferably said engineered enzyme or said fusion protein is capable of hydrolysing
the peptide bond between positions 31/32, 32/33, 44/45, 87/88, 128/129 and/or 141/142 (most preferred between
positions 31/32 and 32/33) in hTNF-α (SEQ ID NO:96).
[0111] In a further embodiment, the target substrate is a pro-drug which is activated by the engineered enzyme. In a
particular embodiment of this variant, the engineered enzyme has proteolytic activity and the target substrate is a protein
target which is proteolytically activated. Examples of such pro-drugs are pro-proteins such as the inactive forms of
coagulation factors. In another particular variant, the engineered enzyme is an oxidoreductase and the target substrate
is a chemical that can be activated by oxidation.

[0112] In a second variant of this aspect, the engineered enzyme is used as a technical means in order to catalyze
an industrially or nutritionally relevant reaction with defined specificity. In a particular embodiment of this variant the
engineered enzyme has proteolytic activity, the catalyzed reaction is a proteolytic processing, and the engineered enzyme
specifically hydrolyses one or more industrially or nutritionally relevant protein substrates. In a preferred embodiment of
this variant the engineered enzyme hydrolyses one or more industrially or nutritionally relevant protein substrates at
specific sites, thereby leading to industrially or nutritionally desired product properties such as texture, taste or precipitation
characteristics. In a further particular embodiment of this variant, the engineered enzyme catalyzes the hydrolysis of
glycosidic bonds (glycosidase or glycosylase activity). Then, preferably, the catalyzed reaction is a polysaccharide
processing, and the engineered enzyme specifically hydrolyses one or more industrially, technically or nutritionally relevant
polysaccharide substrates. In a further particular embodiment of this variant, the engineered enzyme catalyzes the
hydrolysis of triglyceride esters or lipids (lipase activity).

[0113] Then, preferably, the catalyzed reaction is a lipid processing step, and the engineered enzyme specifically
hydrolyses one or more industrially, technically or nutritionally relevant lipid substrates. In a further particular variant of
this embodiment, the engineered enzyme catalyzes the oxidation or reduction of substrates (oxidoreductase activity).
Then, preferably, the engineered enzyme specifically oxidizes or reduces one or more industrially, technically or
nutritionally relevant chemical substrates.

[0114] A third aspect is directed to a method for generating engineered enzymes with specificities that are qualitatively
and/or quantitatively novel in combination with the protein scaffold. The method comprises at least the following steps:

(a) providing a protein scaffold capable of catalyzing at least one chemical reaction on at least one target substrate,

(b) generating a library of engineered enzymes or isolated engineered enzymes by combining the protein scaffold
from step (a) with one or more fully or partially random peptide sequences at sites in the protein scaffold that enable
the resulting engineered enzyme to discriminate between at least one target substrate and one or more different
substrates, and

(c) selecting out of the library of engineered enzymes generated in step (b) one or more enzymes that have defined
specificities towards at least one target substrate.

[0115] In a first variant of this aspect, the method comprises at least the following steps:

(a) providing a protein scaffold capable of catalyzing at least one chemical reaction on at least one target substrate,

(b) generating a library of engineered enzymes or isolated engineered enzymes by inserting into the protein scaffold
from step (a) one or more fully or partially random peptide sequences at sites in the protein scaffold that enable the
resulting engineered enzyme to discriminate between at least one target substrate and one or more different
substrates, and

(c) selecting out of the library of engineered enzymes generated in step (b) one or more enzymes that have defined
specificities towards at least one target substrate.
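
Step (b) of this variant can be illustrated at the amino acid level with a minimal sketch. The scaffold string, the insertion sites, the peptide length and the function names below are hypothetical and chosen only for illustration; the disclosure itself prefers to perform the insertion on the DNA level, and selection step (c) is not modelled here.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def random_peptide(length, rng):
    """Draw a fully random peptide of the given length."""
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


def insert_sdrs(scaffold, sites, sdr_length, rng):
    """Insert one random peptide after the first `pos` residues, for each pos in `sites`."""
    variant = scaffold
    # Work from the C-terminal end backwards so earlier indices stay valid.
    for pos in sorted(sites, reverse=True):
        variant = variant[:pos] + random_peptide(sdr_length, rng) + variant[pos:]
    return variant


def make_library(scaffold, sites, sdr_length, size, seed=0):
    """Generate `size` scaffold variants carrying random insertions at the same sites."""
    rng = random.Random(seed)
    return [insert_sdrs(scaffold, sites, sdr_length, rng) for _ in range(size)]


# Hypothetical 10-residue scaffold with two insertion sites (illustrative only).
library = make_library("MKTAYIAKQR", sites=[3, 7], sdr_length=4, size=5)
```

Every variant keeps the scaffold segments unchanged and differs only in the inserted stretches, which is the property that step (c) would then screen.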

[0116] Preferably, the positions at which the one or more fully or partially random peptide sequences are combined
with or inserted into the protein scaffold are identified prior to the combination or insertion.

[0117] The number of insertions or other combinations of fully or partially random peptide sequences as well as their
length may vary over a wide range. The number is at least one, preferably more than one, more preferably between two
and eleven, most preferably between two and six. The length of such fully or partially random peptide sequences is
usually less than 50 amino acid residues. Preferably, the length is between one and 15 amino acid residues, more
preferably between one and six amino acid residues. Alternatively, the length is between two and 20 amino acid residues,
preferably between two and ten amino acid residues, more preferably between three and eight amino acid residues.
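
The number and length of insertions determine how large a fully randomized library can become. The following sketch computes the theoretical sequence diversity under the assumption, not stated in the text, that each randomized position can hold any of the 20 standard amino acids; the function name is hypothetical.

```python
def theoretical_diversity(n_insertions, peptide_length, alphabet_size=20):
    """Number of distinct variants when every inserted position is fully randomized."""
    return alphabet_size ** (n_insertions * peptide_length)


# Two insertions of three residues each: 20**6 distinct sequences.
small_library = theoretical_diversity(2, 3)
# Six insertions of six residues each: 20**36, far beyond what can be screened.
large_library = theoretical_diversity(6, 6)
```

Under this assumption, two insertions of three residues already give 64,000,000 possible variants, which suggests why partially random sequences and repeated selection cycles can be useful in practice.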
[0118] Preferably such insertions or other combinations are performed on the DNA level, using polynucleotides
encoding such protein scaffolds and polynucleotides or oligonucleotides encoding such fully or partially random peptide
sequences.

[0119] Optionally, steps (a) to (c) are repeated cyclically, whereby enzymes selected in step (c) serve as the protein
scaffold in step (a) of a further cycle, and randomized peptide sequences are either inserted or, alternatively, substituted
for peptide sequences that have been inserted in former cycles. Thereby, the number of inserted peptide sequences is
either constant or increases over the cycles. The cycles are repeated until one or more enzymes with the intended
specificities are generated.
[0120] Moreover, during or after one or more rounds of steps (a) to (c), the scaffold may be mutated at one or more
positions in order to make the scaffold more acceptable for the combination with SDR sequences, and/or to increase
catalytic activity at a specific pH and temperature, and/or to change the glycosylation pattern, and/or to decrease sensitivity
towards enzyme inhibitors, and/or to change enzyme stability.

[0121] In a second variant of this aspect, the method comprises at least the following steps: 

(a) providing a first protein scaffold fragment,

(b) connecting said protein scaffold fragment via a peptide linkage with a first SDR, and optionally

(c) connecting the product of step (b) via a peptide linkage with a further SDR peptide or with a further protein
scaffold fragment, and optionally

(d) repeating step (c) for as many cycles as necessary in order to generate a sufficiently specific enzyme, and

(e) selecting out of the population generated in steps (a) - (d) one or more enzymes that have the desired specificities
toward the one or more target substrates.

[0122] Protein scaffold fragment means a part of the sequence of a protein scaffold. A protein scaffold is comprised 
of at least two protein scaffold fragments. 

[0123] In a third variant of this aspect, the protein scaffold, the SDRs and the engineered enzyme are encoded by a
DNA sequence and an expression system is used in order to produce the protein. In an alternative variant, the protein
scaffold, the SDRs and/or the engineered enzyme are chemically synthesized from peptide building blocks.
[0124] In a fourth variant of this aspect, the method comprises at least the following steps: 

(a) providing a polynucleotide encoding a protein scaffold capable of catalyzing one or more chemical reactions on
one or more target substrates;

(b) combining one or more fully or partially random oligonucleotide sequences with the polynucleotide encoding the
protein scaffold, the fully or partially random oligonucleotide sequences being located at sites in the polynucleotide
that enable the encoded engineered enzyme to discriminate between the one or more target substrates and one or
more other substrates; and

(c) selecting out of the population generated in step (b) one or more polynucleotides that encode enzymes that have
the defined specificities toward the one or more target substrates.

[0125] Any enzyme can serve as the protein scaffold in step (a). It can be a naturally occurring enzyme, a variant or
a truncated derivative thereof, or an engineered enzyme. For human therapeutic use, the protein scaffold is preferably
a mammalian enzyme, and more preferably a human enzyme. In that aspect, the disclosure is directed to a method for the
generation of essentially mammalian, especially of essentially human enzymes with specificities that are different from
specificities of any enzyme encoded in mammalian genomes or in the human genome, respectively.

[0126] The protein scaffold provided in step (a) of this aspect must be capable of catalyzing one or more chemical
reactions on a target substrate. Therefore, a protein scaffold is selected from the group of potential protein scaffolds by
its activity on the target substrate.

[0127] In a preferred variant of this aspect, a protein scaffold with hydrolase activity is used. Preferably, a protein 
scaffold with proteolytic activity is used, and more preferably, a protease with very low specificity having basic activity 
on the target substrate is used as the protein scaffold. Examples of proteases from different structural classes with low 
substrate specificity are Papain, Trypsin, Chymotrypsin, Subtilisin, SET (trypsin-like serine protease from Streptomyces 
erythraeus), Elastase, Cathepsin G or Chymase. Before being employed as the protein scaffold, the amino acid sequence 
of the protease may be modified in order to change protein properties other than specificity, e.g. catalytic activity, stability,
inhibitor sensitivity, or expression yield, essentially as described in WO 92/18645, or in order to change specificity, 
essentially as described in EP 02020576.3 and PCT/EP03/04864. 

[0128] Another option for a feasible protein scaffold is a lipase. Hepatic lipase, lipoprotein lipase and pancreatic lipase
belong to the "lipoprotein lipase superfamily", which in turn is an example of the GX-class of lipases (M. Fischer, J. Pleiss
(2003), Nucl. Acids Res., 31, 319-321). The substrate specificity of lipases can be characterized by their relative activity
towards triglycerol esters of fatty acids and phospholipids, bearing a charged head group. Alternatively, other hydrolases
such as esterases, glycosylases, amidases, or nitrilases may be used as scaffolds.

[0129] Transferases are also feasible protein scaffolds. Glycosyltransferases are involved in many biological syntheses
involving a variety of donors and acceptors.

[0130] Alternatively, the protein scaffold may have ligase, lyase, oxidoreductase, or isomerase activity.
[0131] In a first embodiment, the one or more fully or partially random peptide sequences are inserted at specific sites
in the protein scaffold. These insertion sites are characterized by the fact that the inserted peptide sequences can act
as discriminators between different substrates, i.e. as Specificity Determining Regions or SDRs. Such insertion sites
can be identified by several approaches. Preferably, insertion sites are identified by analysis of the three-dimensional
structure of the protein scaffolds, by comparative analysis of the primary sequences of the protein scaffold with other
enzymes having different quantitative specificities, or experimentally by techniques such as alanine scanning, random
mutagenesis, or random deletion, or by any combination thereof.

[0132] A first approach to identify insertion sites for SDRs is based on the three-dimensional structure of the protein
scaffold as it can be obtained by x-ray crystallography or by nuclear magnetic resonance studies. Structural alignment
of the protein scaffold in comparison with other enzymes of the same structural class but having different quantitative
specificities reveals regions of high structural similarity and regions with low structural similarity. Such an analysis can
for example be done using public software such as Swiss PDB viewer (Guex, N. and Peitsch, M.C. (1997) Electrophoresis
18, 2714-2723). Regions of low structural similarity are preferred SDR insertion sites.
[0133] In a second approach to identify insertion sites for SDRs, three-dimensional structures of the scaffold protein
in complex with competitive inhibitors or substrate analogs are analysed. It is assumed that the binding site of a competitive
inhibitor significantly overlaps with the binding site of the substrate. In that case, atoms of the protein that are within a
certain distance of atoms of the inhibitor are likely to be in a similar distance to the substrate as well. Choosing a short
distance, e.g. < 5 Å, will result in an ensemble of protein atoms that are in close contact with the substrate. The
residues carrying these atoms would constitute the first shell contacts and are therefore preferred insertion sites for SDRs.
Once first shell contacts have been identified, second shell contacts can be found by repeating the distance analysis starting
from first shell atoms. In yet another alternative the distance analysis described above is performed starting from the
active site residues.
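
The distance analysis of the second approach can be sketched as follows. This is a minimal illustration, assuming that atom coordinates have already been extracted from a structure file into plain tuples; the data layout, the function names and the toy coordinates are hypothetical, and the 5 Å cutoff is the example value given above.

```python
import math


def atoms_within(protein_atoms, probe_coords, cutoff=5.0):
    """Return protein atoms lying closer than `cutoff` to any probe coordinate.

    protein_atoms: list of (residue_number, (x, y, z)) tuples, distances in angstroms.
    """
    return [
        (res, xyz)
        for res, xyz in protein_atoms
        if any(math.dist(xyz, p) < cutoff for p in probe_coords)
    ]


def shell_contacts(protein_atoms, inhibitor_coords, cutoff=5.0):
    """Return (first_shell_residues, second_shell_residues) as sorted lists."""
    first_atoms = atoms_within(protein_atoms, inhibitor_coords, cutoff)
    first = sorted({res for res, _ in first_atoms})
    # Repeat the distance analysis starting from the first shell atoms.
    near_first = atoms_within(protein_atoms, [xyz for _, xyz in first_atoms], cutoff)
    second = sorted({res for res, _ in near_first} - set(first))
    return first, second


# Toy coordinates, one atom per residue (illustrative only).
protein = [(57, (0.0, 0.0, 0.0)), (102, (3.0, 0.0, 0.0)),
           (195, (7.0, 0.0, 0.0)), (214, (20.0, 0.0, 0.0))]
inhibitor = [(1.0, 0.0, 0.0)]
first_shell, second_shell = shell_contacts(protein, inhibitor)
```

For the toy data, residues 57 and 102 fall into the first shell and residue 195 into the second, while residue 214 is too distant to be reported.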

[0134] In a third approach to identify insertion sites for SDRs, the primary sequence of the scaffold protein is aligned
with other enzymes of the same structural class but having different quantitative specificities using an alignment algorithm.
Examples of such alignment algorithms are published (Altschul, S.F., Gish, W., Miller, W., Myers, E.W. & Lipman, D.J.
(1990) J. Mol. Biol. 215:403-410; "Statistical methods in Bioinformatics: an introduction" by Ewens, W. & Grant, G.R.
2001, Springer, New York). Such an alignment may reveal conserved and non-conserved regions with varying sequence
homology, and, in particular, additional sequence elements in one or more enzymes compared to the scaffold protein.
Conserved regions are more likely to contribute to phenotypes shared among the different proteins, e.g. stabilizing
the three-dimensional fold. Non-conserved regions and, in particular, additional sequences in enzymes with quantitatively
higher specificity (Turner, R. et al. (2002) J. Biol. Chem., 277, 33068-33074) are preferred insertion sites for SDRs.
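
Once an alignment is available, additional sequence elements relative to the scaffold can be located mechanically. The sketch below assumes two already aligned sequences of equal length with "-" as the gap character; the function name, the example sequences and the default minimum length of four residues (that is, longer than three) are illustrative assumptions rather than part of the disclosed method.

```python
def find_insertions(aligned_scaffold, aligned_other, min_length=4):
    """Find stretches present in `aligned_other` but absent from the scaffold.

    Returns a list of (scaffold_position, inserted_sequence), where
    scaffold_position is the number of scaffold residues preceding the stretch.
    """
    if len(aligned_scaffold) != len(aligned_other):
        raise ValueError("aligned sequences must have equal length")

    insertions = []
    scaffold_pos = 0  # scaffold residues seen so far
    run = []          # residues of the insertion currently being read
    for s, o in zip(aligned_scaffold, aligned_other):
        if s == "-":
            if o != "-":
                run.append(o)  # extra residue relative to the scaffold
            continue           # columns that are gaps in both are skipped
        # A scaffold residue closes any open insertion.
        if len(run) >= min_length:
            insertions.append((scaffold_pos, "".join(run)))
        run = []
        scaffold_pos += 1
    if len(run) >= min_length:
        insertions.append((scaffold_pos, "".join(run)))
    return insertions


# Hypothetical aligned fragments (illustrative only).
hits = find_insertions("MKT----AYI-AK", "MKTGSDGAYIPAK")
```

In this toy example the four-residue stretch GSDG after scaffold residue 3 is reported, while the single extra residue further downstream is ignored as too short.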
[0135] For proteases currently five families are known, namely aspartic, cysteine, serine, metallo and threonine
proteases. Each family includes groups of proteases that share a similar fold. Crystallographic structures of members
of these groups have been solved and are accessible through public databases, e.g. the Brookhaven protein database
(H.M. Berman et al. Nucleic Acids Research, 28 pp. 235-242 (2000)). Such databases also include structural homologs
in other enzyme classes and nonenzymatically active proteins of each class. Several tools are available to search public
databases for structural homologues: SCOP - a structural classification of proteins database for the investigation of
sequences and structures (Murzin A. G. et al. (1995) J. Mol. Biol. 247, 536-540); CATH - Class, Architecture, Topology
and Homologous superfamily: a hierarchical classification of protein domain structures (Orengo et al. (1997) Structure
5(8) 1093-1108); FSSP - Fold classification based on structure-structure alignment of proteins (Holm and Sander (1998)
Nucl. Acids Res. 26 316-319); or VAST - Vector alignment search tool (Gibrat, Madej and Bryant (1996) Current Opinion
in Structural Biology 6, 377-385).

[0136] In the above described approaches, members of structural classes are compared in order to identify insertion
sites for SDRs.

[0137] In a preferred variant of these approaches serine proteases of the structural class S1 are compared with each
other. Trypsin represents a member with low substrate specificity, as it requires only an arginine or lysine residue at the
P1 position. On the other hand, thrombin, tissue-type plasminogen activator or enterokinase all have a high specificity
towards their substrate sequences, i.e. (L/I/V/F)XPR↓NA, CPGR↓VVGG and DDDK↓, respectively (Perona, J. & Craik,
C. (1997) J. Biol. Chem., 272, 29987-29990; Perona, J. & Craik, C. (1995) Protein Science, 4, 337-360). An alignment
of the amino acid sequences of these proteases is described in example 1 (Figure 2) along with the identification of SDRs.
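
The difference between low and high substrate specificity can be made concrete by counting potential cleavage sites in a sequence. The sketch below encodes the recognition sequences named above as regular expressions, reading them as (L/I/V/F)XPR followed by NA for thrombin, CPGR followed by VVGG for tissue-type plasminogen activator, DDDK for enterokinase and any Arg or Lys for trypsin. This reading, the pattern table and the test sequence are assumptions made for illustration only; real protease specificity depends on more than a linear motif.

```python
import re

# P-side motif, optionally followed by a lookahead for the P'-side residues.
MOTIFS = {
    "trypsin": r"[KR]",
    "thrombin": r"[LIVF].PR(?=NA)",
    "tPA": r"CPGR(?=VVGG)",
    "enterokinase": r"DDDK",
}


def cleavage_sites(sequence, motif):
    """Return the 1-based positions of the P1 residue for every motif match.

    The motif is wrapped in a lookahead so that overlapping matches are found.
    """
    return [m.end(1) for m in re.finditer(f"(?=({motif}))", sequence)]


# Hypothetical test sequence (illustrative only).
sequence = "MKLVPRNADDDDKGR"
site_map = {name: cleavage_sites(sequence, motif) for name, motif in MOTIFS.items()}
```

On this sequence the trypsin pattern matches at four positions, whereas the thrombin and enterokinase patterns each match once and the tPA pattern not at all.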
[0138] A further example within the family of serine proteases is given by members of the structural class S8 (subtilisin
fold). Subtilisin is the type protease for this class and represents an unspecific protease (Ottesen, M. & Svendsen, A.
(1998) Methods Enzymol. 19, 199-215). Furin, PC1 and PC5 are proteases of the same structural class involved in the
processing of propeptides and have a high substrate specificity (Seidah, N. & Chretien, M. (1997) Curr. Opin. Biotech.,
8: 602-607; Bergeron, F. et al. (2000) J. Mol. Endocrin., 24:1-22). In a preferred variant of the approach alignments of
the primary amino acid sequences (Figure 4) are used to identify eleven sequence stretches longer than three amino
acids which specific proteases have in addition compared to subtilisin and are therefore potential specificity determining
regions. In a further variant of the approach information from the three-dimensional structure of subtilisin can be used
in order to further narrow down the selection (Figure 3). Out of the eleven inserted sequence stretches, three are especially
close to the active site residues, namely stretch numbers 7, 8 and 11, which are insertions in PC5, PC1 and all three
specific proteases, respectively (Figure 3). In a preferred variant, one or several amino acid stretches of variable length
and composition can be inserted into the subtilisin sequence at one or several of the eleven positions. In a more preferred
variant of the approach the insertion is performed at regions 7, 8 or 11 or any combination thereof. In another preferred
variant of the approach protease scaffolds other than subtilisin from the structural class S8 are used.
[0139] In a further preferred variant of this approach, aspartic acid proteases of the structural class A1 are analyzed
(Rawlings, N.D. & Barrett, A.J. (1995), Methods Enzymol. 248, 105-120; Chitpinityol, S. & Crabbe, M.J. (1998), Food
Chemistry, 61, 395-418). Examples for the A1 structural class of aspartic proteases are pepsin, with a low substrate
specificity, as well as beta-secretase (Gruninger-Leitch, F., et al. (2002) J. Biol. Chem. 277, 4687-4693) and renin
(Wang, W. & Liang, TC. (1994) Biochemistry, 33, 14636-14641), with relatively high substrate specificities. Retroviral
proteases also belong to this class, although the active enzyme is a dimer of two identical subunits. The viral proteases
are essential for the correct processing of the polyprotein precursor to generate functional proteins, which requires a high
substrate specificity in each case (Wu, J. et al. (1998) Biochemistry, 37, 4518-4526; Pettit, S. et al. (1991) J. Biol. Chem.,
266, 14539-14547). Pepsin is the type protease for this class and represents an unspecific protease (Kageyama, T. (2002)
Cell. Mol. Life Sci. 59, 288-306). Beta-secretase and cathepsin D (Aguilar, C. F. et al. (1995) Adv. Exp. Med. Biol. 362,
155-166) are proteases of the same structural class and have a high substrate specificity. In a preferred variant of the
approach alignments of the primary amino acid sequences (Figure 6) are used to identify six sequence stretches longer
than three amino acids which are inserted in the specific proteases compared to pepsin and are therefore potential
specificity determining regions. In a further variant of the approach information from the three-dimensional structure of
beta-secretase can be used in order to further narrow down the selection. Out of the six inserted sequence stretches,
three are especially close to the active site residues, namely stretch numbers 1, 3 and 4, which are insertions in
cathepsin D and beta-secretase, respectively (Figure 5). In a preferred variant of the approach, one or several amino acid
stretches of variable length and composition can be inserted into the pepsin sequence at one or several of the six
positions. In a more preferred embodiment the insertion is performed at the positions 1, 3 or 4 or any combination thereof.
In another preferred embodiment protease scaffolds other than pepsin are used.

[0140] There are cases where a certain structural class does not include known members of low and high specificity.
This is exemplified by the C14 class of caspases which belong to the cysteine protease family (Rawlings, N.D. & Barrett,
A. J. (1994) Methods Enzymol. 244, 461-486) and which all show high specificity for P4 to P1 positions. For example,
caspase-1, caspase-3 and caspase-9 recognize the sequences YVAD↓, DEVD↓ or LEHD↓, respectively. Identification
of the regions that differ between the caspases will include the regions responsible for the differences in substrate
specificity (Figures 7 and 8).

[0141] Finally, non-enzymatic proteins of the same fold as the enzyme scaffold may also contribute to the identification
of insertion sites for SDRs. For example, haptoglobin (Arcoleo, J. & Greer, J.; (1982) J. Biol. Chem. 257, 10063-10068)
and azurocidin (Almeida, R. et al. (1991) Biochem. Biophys. Res. Commun. 177, 688-695) share the same chymotrypsin-
like fold with all S1 proteases. Due to substitutions in the active site residues these proteins do not possess any proteolytic
function, yet they show high homology with active proteases. Differences between these proteins and specific proteases
include regions that can serve as insertion sites for SDRs.

In a fourth approach, insertion sites for SDRs are identified experimentally by techniques such as alanine scanning,
random mutagenesis, random insertion or random deletion. In contrast to the approach disclosed above, this approach
does not require detailed knowledge about the three-dimensional structure of the scaffold protein. In one preferred
variant of this approach, random mutagenesis of enzymes with relatively high specificity from the same structural class
as the protein scaffold and screening for loss or change of specificity can be used to identify insertion sites for SDRs in
the protein scaffold.

Random mutagenesis, alanine scanning, random insertion or random deletion are all done on the level of the
polynucleotides encoding the enzymes. There are a variety of protocols known in the literature (e.g. Sambrook, J.F.; Fritsch,
E.F.; Maniatis, T.; Cold Spring Harbor Laboratory Press, Second Edition, 1989, New York). For example, random
mutagenesis can be achieved by the use of a polymerase as described in patent WO 9218645. According to this patent,
the one or more genes encoding the one or more proteases are amplified by use of a DNA polymerase with a high error
rate or under conditions that increase the rate of misincorporations. For example the method of Cadwell and Joyce can
be employed (Cadwell, R.C. and Joyce, G.F., PCR methods. Appl. 2 (1992) 28-33). Other methods of random
mutagenesis such as, but not limited to, the use of mutator strains, chemical mutagens or UV-radiation can be employed as well.
Alternatively, oligonucleotides can be used for mutagenesis that substitute randomly distributed amino acid residues
with an alanine. This method is generally referred to as alanine scanning mutagenesis (Fersht, A.R. Biochemistry (1989)
8031-8036). As a further alternative, modifications of the alanine scanning mutagenesis such as binomial mutagenesis
(Gregoret, L.M. and Sauer, R.T. PNAS (1993) 4246-4250) or combinatorial alanine scanning (Weiss et al., PNAS (2000)
8950-8954) can be employed.
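
The idea behind alanine scanning can be illustrated on the protein sequence level. The sketch below enumerates every single-position alanine substitution of a short sequence; it is a simplified, systematic variant of the method, whereas the text describes oligonucleotide-based substitution of randomly distributed residues on the DNA level. The function name and the example sequence are hypothetical.

```python
def alanine_scan(sequence):
    """Return (position, original_residue, variant_sequence) for each non-alanine residue.

    Positions are 1-based; residues that are already alanine are skipped.
    """
    variants = []
    for i, residue in enumerate(sequence):
        if residue == "A":
            continue  # nothing to substitute
        variants.append((i + 1, residue, sequence[:i] + "A" + sequence[i + 1:]))
    return variants


# Hypothetical tripeptide (illustrative only).
scan = alanine_scan("MAK")
```

Each variant would then be assayed, and positions whose substitution changes specificity would be candidate insertion sites for SDRs.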

[0142] In order to express engineered enzymes, the DNA encoding such engineered proteins is ligated into a suitable
expression vector by standard molecular cloning techniques (e.g. Sambrook, J.F.; Fritsch, E.F.; Maniatis, T.; Cold Spring
Harbor Laboratory Press, Second Edition, 1989, New York). The vector is introduced into a suitable expression host cell,
which expresses the corresponding engineered enzyme variant. Particularly suitable expression hosts are bacterial
expression hosts such as Escherichia coli or Bacillus subtilis, or yeast expression hosts such as Saccharomyces cerevisiae
or Pichia pastoris, or mammalian expression hosts such as Chinese Hamster Ovary (CHO) or Baby Hamster Kidney
(BHK) cell lines, or viral expression systems such as bacteriophages like M13 or Lambda, or viruses such as the
Baculovirus expression system. As a further alternative, systems for in vitro protein expression can be used. Typically,
the DNA is ligated into an expression vector behind a suitable signal sequence that leads to secretion of the enzyme
variants into the extracellular space, thereby allowing direct detection of protease activity in the cell supernatant.
Particularly suitable signal sequences for Escherichia coli are HlyA, for Bacillus subtilis AprE, NprB, Mpr, AmyA, AmyE,
Blac, SacB, and for S. cerevisiae Bar1, Suc2, Matα, Inu1A, Ggp1p. Alternatively, the enzyme variants are expressed
intracellularly and the substrates are expressed also intracellularly. Preferably, this is done essentially as described in
patent application WO 0212543, using a fusion peptide substrate comprising two auto-fluorescent proteins linked by the
substrate amino acid sequence. As a further alternative, after intracellular expression of the enzyme variants, or secretion
into the periplasmatic space using signal sequences such as DsbA, PhoA, PelB, OmpA, OmpT or gIII for Escherichia
coli, a permeabilisation or lysis step releases the enzyme variants into the supernatant. The destruction of the membrane
barrier can be forced by the use of mechanical means such as ultrasonication or a French press, or by the use of
membrane-digesting enzymes such as lysozyme. As another, further alternative, the genes encoding the enzyme variants
are expressed cell-free by the use of a suitable cell-free expression system. For example, the S30 extract from Escherichia
coli cells is used for this purpose as described by Lesley et al. (Methods in Molecular Biology 37 (1995) 265-278).
The ensemble of gene variants generated and expressed by any of the above methods is analyzed with respect to
affinity, substrate specificity or activity by appropriate assay and screening methods as described in detail for
example in patent application PCT/EP03/04864. Genes from catalytically active variants having reduced specificity in
comparison to the original enzyme are analyzed by sequencing. Sites at which mutations and/or insertions and/or
deletions occurred are preferred insertion sites at which SDRs can be inserted site-specifically.
[0143] In a second embodiment, the one or more fully or partially random peptide sequences are inserted at random
sites in the protein scaffold. This modification is usually done on the polynucleotide level, i.e. by inserting nucleotide
sequences into the gene that encodes the protein scaffold. Several methods are available that enable the random
insertion of nucleotide sequences. Systems that can be used for random insertion are, for example, ligation-based systems
(Murakami et al. Nature Biotechnology 20 (2002) 76-81), systems based on DNA polymerisation and transposon-based
systems (e.g. GPS-M™ mutagenesis system, NEB Biolabs; MGS™ mutation generation system, Finnzymes). The
transposon-based methods employ a transposase-mediated insertion of a selectable marker gene that contains at its termini
recognition sequences for the transposase as well as two sites for a rare-cutting restriction endonuclease. Using the
latter endonuclease, one usually releases the selection marker and, after religation, obtains an insertion. Instead of
performing the religation, one can alternatively insert a fragment that has terminal recognition sequences for one or two
outside-cutting restriction endonucleases as well as a selectable marker. After ligation, one releases this fragment using
the one or two outside-cutting endonucleases. After creating blunt ends by standard methods, one inserts blunt-ended
random fragments at random positions into the gene.

In a further preferred embodiment, methods for homologous in vitro recombination are used to combine the mutations
introduced by the above-mentioned methods to generate enzyme populations. Examples of methods that can be applied
are the Recombination Chain Reaction (RCR) according to patent application WO 0134835, the DNA-Shuffling method
according to patent application WO 9522625, the Staggered Extension method according to patent WO 9842728,
or the Random Priming recombination according to patent application WO 9842728. Furthermore, methods for
non-homologous recombination such as the Itchy method can also be applied (Ostermeier, M. et al. Nature Biotechnology 17
(1999) 1205-1209).

Upon random insertion of a nucleotide sequence into the protein scaffold one obtains a library of different genes encoding
enzyme variants. The polynucleotide library is subsequently transferred to an appropriate expression vector. Upon
expression in a suitable host or by use of an in vitro expression system, a library of enzymes containing randomly inserted
stretches of amino acids is obtained.

[0144] According to step (b) of this third aspect, one or more fully or partially random peptide sequences are inserted
into the protein scaffold. The actual number of such inserted SDRs is determined by the intended quantitative specificity
following the relation: the higher the intended specificity is, the more SDRs are inserted. Whereas a single SDR enables
the generation of moderately specific enzymes, two SDRs already enable the generation of significantly specific enzymes.
However, up to six and more SDRs can be inserted into a protein scaffold. A similar relation is valid for the length of the
SDRs: the higher the intended specificity is, the longer are the SDRs that are to be inserted. SDRs can be as short as
one to four amino acid residues. They can, however, also be as long as 50 amino acid residues. Significant specificity
can already be generated by the use of SDRs of a length of four to six amino acid residues.

[0145] The peptide sequences that are inserted can be fully or partially random. In this context, fully random means
that a set of sequences is inserted in parallel that includes sequences that differ from each other in each and every
position. Partially random means that a set of sequences is inserted in parallel that includes sequences that differ from
each other in at least one position. This difference can be either pair-wise or with respect to a single sequence. For
example, when regarding an insertion of the length of four amino acids, partially random could be a set (i) that includes
AGGG, GVGG, GGLG, GGGI, or (ii) that includes AGGG, VGGG, LGGG and IGGG. Alternatively, random sequences
also comprise sequences that differ from each other in length. Randomization of the peptide sequences is achieved
by randomization of the nucleotide sequences that are inserted into the gene at the respective sites. Thereby,
randomization can be achieved by employing mixtures of nucleobases as monomers during chemical synthesis of the
oligonucleotides. A particularly preferred mixture of monomers for a fully random codon that in addition minimizes the probability
of stop codons is NN(GTC). Alternatively, random oligonucleotides can be obtained by fragmentation of DNA into short
fragments that are inserted into the gene at the respective sites. The source of the DNA to be fragmented may be a
synthetic oligonucleotide but alternatively may originate from cloned genes, cDNAs, or genomic DNA. Preferably, the
DNA is a gene encoding an enzyme. The fragmentation can, for example, be achieved by random endonucleolytic
digestion of DNA. Preferably, an unspecific endonuclease such as DNAse I (e.g. from bovine pancreas) is employed
for the endonucleolytic digestion.
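
The effect of the NN(GTC) monomer mixture on stop codons can be checked with a short enumeration. The following Python sketch is illustrative only: it assumes the standard genetic code, and the names `CODON_TABLE` and `codon_stats` are hypothetical helpers, not part of the disclosure. It compares a fully degenerate NNN codon with NN(GTC).

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each of the three positions.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_stats(pos1, pos2, pos3):
    """Return (codon count, stop codon count, set of encoded amino acids)."""
    codons = ["".join(c) for c in product(pos1, pos2, pos3)]
    encoded = [CODON_TABLE[c] for c in codons]
    stops = encoded.count("*")
    return len(codons), stops, set(encoded) - {"*"}


nnn = codon_stats("ACGT", "ACGT", "ACGT")  # fully degenerate codon
nnb = codon_stats("ACGT", "ACGT", "GTC")   # NN(GTC) as named in the text

for name, (n, stops, aas) in (("NNN", nnn), ("NN(GTC)", nnb)):
    print(f"{name}: {n} codons, {stops} stop codon(s), {len(aas)} amino acids")
```

Under the standard code, excluding A at the third position removes TAA and TGA while still covering all twenty amino acids, which is consistent with the preference stated above.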

[0146] If steps (a) - (c) of the method are repeated cyclically, there are different alternatives for obtaining random
peptide sequences that are inserted in consecutive rounds. Preferably, SDRs that were identified in one round as leading
to increased specificity of the enzyme are used as templates for the random peptide sequences that are inserted in the
following round.

[0147] In a preferred alternative, the sequences selected in one round are analysed and randomized oligonucleotides
are generated based on these sequences. This can, for example, be achieved by using, in addition to the original
nucleotide, a certain percentage of a mixture of the other three nucleotide monomers at each position in the
oligonucleotide synthesis. If, for example, in a first round an SDR is identified that has the amino acid sequence ARLT, e.g.
encoded by the nucleotide sequence GCG CGC CTT ACC, a random peptide sequence inserted in this SDR site could
be encoded by an oligonucleotide with 70% G, 10% A, 10% T and 10% C at the first position, 70% C, 10% G, 10% T
and 10% A at the second position, etc. This leads at each position approximately in 1 of 3 cases to the template amino
acid and in 2 of 3 cases to another amino acid.
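
The 70/10/10/10 doping scheme can be worked through numerically. The sketch below is a simplified model with assumptions that are not in the text: each nucleotide position is doped independently, the standard genetic code applies, and `retention` is a hypothetical helper name. In this model the exact template codon survives with probability 0.7 × 0.7 × 0.7 ≈ 0.34, in line with the "1 of 3" figure above; counting synonymous codons gives a somewhat higher estimate at the amino acid level.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each of the three positions.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def retention(template_codon, keep=0.7):
    """Probability that a doped codon still encodes the template amino acid.

    Assumes each position keeps the template base with probability `keep`
    and takes each of the other three bases with probability (1 - keep) / 3.
    """
    other = (1.0 - keep) / 3.0
    target = CODON_TABLE[template_codon]
    total = 0.0
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid != target:
            continue
        p = 1.0
        for got, want in zip(codon, template_codon):
            p *= keep if got == want else other
        total += p
    return total


# Template from the text: ARLT encoded by GCG CGC CTT ACC.
for codon in ("GCG", "CGC", "CTT", "ACC"):
    print(codon, CODON_TABLE[codon], round(retention(codon), 3))
```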

In another preferred alternative, the sequences selected in one round are analyzed and a consensus library is generated
based on these sequences. This can, for example, be achieved by using defined mixtures of nucleotides at each position
in the oligonucleotide synthesis in a way that leads to mixtures of the amino acid residues that were identified at each
position of the SDR selected in the previous round. If, for example, in a first round two SDRs are identified that have the
amino acid sequences ARLT and VPGS, a consensus library inserted in this SDR site in the following round could be
encoded by an oligonucleotide with the sequence G(C/T)G C(G/C)C (G/T)(G/T)G (A/T)CC. This would correspond to
the random peptide sequence (A/V)(R/P)(L/G/V/W)(T/S), thereby allowing all combinations of the amino acid residues
identified in the first round and, due to the degeneracy of the genetic code, allowing in addition, to a lower degree,
alternative amino acid residues at some positions.
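
The correspondence between the degenerate oligonucleotide and the peptide mixture can be verified by expanding each degenerate codon. This is a minimal sketch assuming the standard genetic code; `OLIGO` and `encoded_residues` are illustrative names introduced here.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each of the three positions.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# G(C/T)G C(G/C)C (G/T)(G/T)G (A/T)CC, one string of allowed bases per position.
OLIGO = ["G", "CT", "G", "C", "GC", "C", "GT", "GT", "G", "AT", "C", "C"]


def encoded_residues(oligo):
    """For each codon of a degenerate oligo, return the set of amino acids it can encode."""
    result = []
    for i in range(0, len(oligo), 3):
        codons = ("".join(c) for c in product(*oligo[i:i + 3]))
        result.append({CODON_TABLE[c] for c in codons})
    return result


for position, residues in enumerate(encoded_residues(OLIGO), start=1):
    print(position, "/".join(sorted(residues)))
```

The third codon illustrates the additional residues mentioned above: covering L and G with (G/T)(G/T)G also admits V and W.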

[0148] In another preferred alternative, the sequences selected in one round are, without previous analysis, recombined
using methods for the in vitro recombination of polynucleotides, such as the methods described in WO 01/34835 (the
following also provides details of the eighth and ninth aspect).

[0149] After insertion of the partially or fully random sequences into the gene encoding the scaffold protein, and
eventual ligation of the resulting gene into a suitable expression vector using standard molecular cloning techniques
(Sambrook, J.F; Fritsch, E.F.; Maniatis, T.; Cold Spring Harbor Laboratory Press, Second Edition, 1989, New York), the
vector is introduced into a suitable expression host cell which expresses the corresponding enzyme variant. Particularly
suitable expression hosts are bacterial expression hosts such as Escherichia coli or Bacillus subtilis, or yeast expression
hosts such as Saccharomyces cerevisiae or Pichia pastoris, or mammalian expression hosts such as Chinese Hamster
Ovary (CHO) or Baby Hamster Kidney (BHK) cell lines, or viral expression systems such as bacteriophages like M13,
T7 phage or Lambda, or viruses such as the Baculovirus expression system. As a further alternative, systems for in vitro
protein expression can be used. Typically, the DNA is ligated into an expression vector behind a suitable signal sequence
that leads to secretion of the enzyme variants into the extracellular space, thereby allowing direct detection of enzyme
activity in the cell supernatant. Particularly suitable signal sequences for Escherichia coli are ompA, pelB, HlyA, for
Bacillus subtilis AprE, NprB, Mpr, AmyA, AmyE, Blac, SacB, and for S. cerevisiae Bar1, Suc2, Matα, Inu1A, Ggplp.
Alternatively, the enzyme variants are expressed intracellularly and the substrates are also expressed intracellularly.
For protease variants, this is done essentially as described in patent application WO 0212543, using a fusion
peptide substrate comprising two auto-fluorescent proteins linked by the substrate amino-acid sequence. As a further
alternative, after intracellular expression of the enzyme variants, or secretion into the periplasmatic space using signal
sequences such as DsbA, PhoA, PelB, OmpA, OmpT or gIII for Escherichia coli, a permeabilisation or lysis step releases
the enzyme variants into the supernatant. The destruction of the membrane barrier can be forced by the use of mechanical
means such as ultrasonication or a French press, or the use of membrane-digesting enzymes such as lysozyme. As another,
further alternative, the genes encoding the enzyme variants are expressed cell-free by the use of a suitable cell-free
expression system. For example, the S30 extract from Escherichia coli cells is used for this purpose as described by
Lesly et al. (Methods in Molecular Biology 37 (1995) 265-278).

[0150] After introduction of the vector into host cells, these cells are screened for the expression of enzymes with
specificity for the intended target substrate. Such screening is typically done by separating the cells from each other, in
order to enable the correlation of genotype and phenotype, and assaying the activity of each cell clone after a growth
and expression period. Such separation can, for example, be done by distribution of the cells into the compartments of
sample carriers, e.g. as described in WO 01/24933. Alternatively, the cells are separated by streaking on agar plates,
by enclosing in a polymer such as agarose, by filling into capillaries, or by similar methods.

Identification of variants with the intended specificity can be done by different approaches. In the case of proteases,
assays using peptide substrates essentially as described in PCT/EP03/04864 are preferably employed.
[0151] Regardless of the expression format, selection of enzyme variants is done under conditions that allow
identification of enzymes that preferentially recognize and convert the target sequence. As a first alternative, enzymes that
preferentially recognize and convert the target sequence are identified by screening for enzymes with a high affinity for the
target substrate sequence. High affinity corresponds to a low K_M, which is selected by screening at target substrate
concentrations substantially below the K_M of the first enzyme. Preferably, the substrates that are used are linked to one
or more fluorophores that enable the detection of the modification of the substrate at concentrations below 10 µM,
preferably below 1 µM, more preferably below 100 nM, and most preferably below 10 nM.
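
Why screening at low substrate concentration favours low-K_M variants can be illustrated with a simple rate model. The sketch below assumes Michaelis-Menten kinetics and equal maximal rates for both variants; neither assumption is stated in the text, and the variant names and K_M values are hypothetical.

```python
def mm_rate(s, k_m, v_max=1.0):
    """Michaelis-Menten rate v = v_max * s / (k_m + s); concentrations share one unit."""
    return v_max * s / (k_m + s)


# Hypothetical variants: name -> K_M in nM (illustrative values only).
variants = {"low_affinity": 100_000.0, "high_affinity": 100.0}


def rate_ratio(s):
    """Rate of the high-affinity variant divided by that of the low-affinity variant."""
    return mm_rate(s, variants["high_affinity"]) / mm_rate(s, variants["low_affinity"])


for s in (10.0, 1_000_000.0):  # 10 nM versus 1 mM substrate
    print(f"[S] = {s:g} nM: high/low rate ratio = {rate_ratio(s):.1f}")
```

In this model the two variants are almost indistinguishable at saturating substrate, whereas at 10 nM the rate ratio approaches the ratio of the K_M values, so the low-K_M variant stands out in the screen.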

[0152] As a second alternative, enzymes that preferentially recognize and convert the target substrate are identified by
employing two or more substrates in the assay and screening for activity on these two or more substrates in comparison.
Preferably, the two or more substrates employed are linked to different marker molecules, thereby enabling the detection
of the modification of the two or more substrates consecutively or in parallel. In the case of proteases, particularly
preferably two peptide substrates are employed, one peptide substrate having an arbitrarily chosen or even partially or
fully random amino-acid sequence, thereby enabling the activity on an arbitrary substrate to be monitored, and the other peptide
substrate having an amino-acid sequence identical to or resembling the intended target substrate sequence, thereby
enabling the activity on the target substrate to be monitored. Especially preferably, these two peptide substrates are linked to
fluorescent marker molecules, and the fluorescent properties of the two peptide substrates are sufficiently different
to distinguish both activities when measured consecutively or in parallel. For example, a fusion protein comprising
a first autofluorescent protein, a peptide, and a second autofluorescent protein according to patent application WO
0212543 can be used for this purpose. Alternatively, fluorophores such as rhodamines are linked chemically to the
peptide substrates.

[0153] As a third alternative, enzymes that preferentially recognize and convert the target substrate are identified by
employing one or more substrates resembling the target substrate together with competing substrates in high excess.
Screening with respect to activity on the substrates resembling the target substrate is then done in the presence of the
competing substrates. Enzymes having a specificity which corresponds qualitatively to the target specificity, but having
only a low quantitative specificity, are identified as negative samples in such a screen, whereas enzymes having a
specificity which corresponds qualitatively and quantitatively to the target specificity are identified positively. Preferably,
the one or more substrates resembling the target substrate are linked to marker molecules, thereby enabling the detection
of their modifications, whereas the competing substrates do not carry marker molecules. The competing substrates have
arbitrarily chosen or random amino-acid sequences, thereby acting as competitive inhibitors for the hydrolysis of the
marker-carrying substrates. For example, protein hydrolysates such as Tryptone can serve as competing substrates for
engineered proteolytic enzymes.
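
The competitor screen can be sketched with the textbook rate law for competitive inhibition, in which the competitor raises the apparent K_M by the factor (1 + [I]/K_i). This model and all numerical values below are assumptions chosen for illustration; the function names are hypothetical.

```python
def rate_with_competitor(s, k_m, competitor, k_i, v_max=1.0):
    """Competitive inhibition: v = v_max * s / (k_m * (1 + competitor / k_i) + s)."""
    return v_max * s / (k_m * (1.0 + competitor / k_i) + s)


def activity_ratio(k_m, k_i, s=1.0, competitor=1000.0):
    """Signal on the labelled substrate with competitor divided by signal without it."""
    with_comp = rate_with_competitor(s, k_m, competitor, k_i)
    without_comp = rate_with_competitor(s, k_m, 0.0, k_i)
    return with_comp / without_comp


# Hypothetical variants with the same K_M for the labelled target (10 µM) but
# different affinity for the unlabelled competitor (K_i in µM).
unspecific = activity_ratio(k_m=10.0, k_i=10.0)       # binds competitor as well as target
specific = activity_ratio(k_m=10.0, k_i=10_000.0)     # barely binds the competitor

print(f"unspecific variant keeps {unspecific:.1%} of its signal")
print(f"specific variant keeps {specific:.1%} of its signal")
```

In this sketch the variant with low quantitative specificity loses nearly all of its signal in the presence of excess competitor and would score as negative, while the specific variant retains most of its activity.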

As a fourth alternative, enzymes that preferentially recognize and convert the target substrate are identified and selected
by an amplification-coupled or growth-coupled selection step. Furthermore, the activity can be measured intracellularly
and the selection can be done by a cell sorter, such as a fluorescence-activated cell sorter.

[0154] As a further alternative, enzymes that recognize and convert the target substrate are identified by first selecting
enzymes that preferentially bind to the target substrate, and secondly selecting out of this subgroup of enzyme variants
those enzymes that convert the target substrate. Selection for enzymes that preferentially bind the target substrate can
be done either by selection of binders to the target substrate or by counter-selection of enzymes that bind to other
substrates. Methods for the selection of binders or for the counter-selection of non-binders are known in the art. Such
methods typically require phenotype-genotype coupling, which can be achieved by using surface display expression
methods. Such methods include, for example, phage or viral display, cell surface display and in vitro display. Phage or viral
display typically involves fusion of the protein of interest to a viral/phage protein. Cell surface display, i.e. either bacterial
or eukaryotic cell display, typically involves fusion of the protein of interest to a peptide or protein that is located at the
cell surface. In in vitro display, the protein is typically made in vitro and linked directly or indirectly to the mRNA encoding
the protein (DE 19646372).

[0155] The disclosure also provides for a composition or pharmaceutical composition comprising one or more
engineered enzymes according to the first aspect as defined herein before. The composition may optionally comprise an
acceptable carrier, excipient and/or auxiliary agent. Non-pharmaceutical compositions as defined herein are research
compositions, nutritional compositions, cleaning compositions, disinfection compositions, cosmetic compositions or
compositions for personal care. Moreover, DNA sequences coding for the engineered enzyme as defined herein before and
vectors containing said DNA sequences are also provided. Finally, transformed host cells (prokaryotic or eukaryotic) or
transgenic organisms containing such DNA sequences and/or vectors, as well as a method utilizing such host cells or
transgenic animals for producing the engineered enzyme of the first aspect are also contemplated.

Detailed description of the figures 

[0156] 

Figure 1: Three-dimensional structure of human trypsin I with the active site residues shown in "ball-and-stick" 
representation and with the marked regions indicating potential SDR insertion sites. 



Figure 2: Alignment of the primary amino acid sequences of the human proteases trypsin I, alpha-thrombin and
enteropeptidase, all of which belong to the structural class S1 of the serine protease family. Trypsin represents an
unspecific protease of this structural class, while alpha-thrombin and enteropeptidase are proteases with high
substrate specificity. Compared to trypsin, several regions of insertions of three or more amino acids into the primary
sequence of alpha-thrombin and enterokinase are seen. The region marked with (-1-) and the region marked with (-3-)
are preferred SDR insertion sites; in the tertiary structure of alpha-thrombin both regions are in the vicinity of the
substrate binding site. These regions therefore fulfill two criteria to be selected as candidates for SDRs: firstly, they
represent insertions in the specific proteases compared to the unspecific one and, secondly, they are close to the
substrate binding site. A representation of the three-dimensional structure is given in figure 3.


Figure 3: Three-dimensional structure of subtilisin with the active site residues being shown in "ball-and-stick" 
representation and with the numbered regions indicating potential SDR insertion sites. 

Figure 4: Alignment of the primary amino acid sequences of subtilisin E, furin, PC1 and PC5, all of which belong to
the structural class S8 of the serine protease family. Subtilisin E represents an unspecific protease of this structural
class, while furin, PC1 and PC5 are proteases with high substrate specificity. Compared to subtilisin, several regions
of insertions of three or more amino acids into the primary sequence of furin, PC1 and PC5 are seen. The regions
marked with (-4-), (-5-), (-7-), (-9-) and (-11-) are preferred SDR insertion sites. These regions fulfill two
criteria to be selected as candidates for SDRs: firstly, they represent insertions in the specific proteases compared
to the unspecific one and, secondly, they are close to the active site residues.

Figure 5: Three-dimensional structure of beta-secretase with the active site residues being shown in "ball-and-stick" 
representation and with the numbered regions indicating potential SDR insertion sites. 

Figure 6: Alignment of the primary amino acid sequences of pepsin, beta-secretase and cathepsin D, all of which belong
to the structural class A1 of the aspartic protease family. Pepsin represents an unspecific protease of this structural
class, while beta-secretase and cathepsin D are proteases with high substrate specificity. Compared to pepsin, several
regions of insertions of three or more amino acids into the primary sequence of beta-secretase and cathepsin D are
seen. The regions marked with (-1-) to (-11-) correspond to possible SDR combining sites and are also marked in Fig. 5.


Figure 7: Illustrates the three-dimensional structure of caspase 7 with the active site residues being shown in
"ball-and-stick" representation and with the numbered regions indicating potential SDR insertion sites.

Figure 8: Shows the primary amino acid sequence of caspase 7 as a member of the cysteine protease class C14
family (see also SEQ ID NO: 14).



Figure 9: Schematic representation of method according to the third aspect. 



Figure 10: Western blot analysis of trypsin expression. Supernatants of cell cultures expressing variants of trypsin
are compared to negative controls. Lane 1: molecular weight standard; lane 2: negative control; lane 3: supernatant
of variant a; lane 4: negative control; lane 5: supernatant of variant b. A primary antibody specific to the expressed
protein and a secondary antibody for generation of the signal were used.

Figure 11: Time course of the proteolytic cleavage of a target substrate. Supernatant of cells containing the vector
with the gene for human trypsin and that of cells containing the vector without the gene were incubated with the
peptide substrate described in the text. Cleavage of the peptide results in a decreased read-out value. Proteolytic
activity is confirmed for the positive clone.
Figure 12: Relative activity of three engineered proteolytic enzymes in comparison with human trypsin I on two
different peptide substrates. A time course of the proteolytic digestion of the two substrates was performed and
evaluated. Substrate B was used for screening and substrate A is a closely related sequence. Relative activity of
the three variants was normalized to the activity of human trypsin I. Variants 1 and 2 clearly show increased specificity
towards the target substrate. Variant 3, on the other hand, serves as a negative control with activities similar to those of
human trypsin I.

Figure 13: Relative specificities of trypsin and variants of engineered proteolytic enzymes with one or two SDRs,
respectively. Activity of the proteases was determined in the presence and absence of competitor substrate, i.e.
peptone at a concentration of 10 mg/ml. Time courses for the proteolytic cleavage were recorded and the time
constants k determined. The ratios between the time constants with and without competitor were formed and
represent a quantitative measure for the specificity of the protease. The ratios were normalized to trypsin. The specificity
of the variant containing two SDRs is 2.5-fold higher than that of the variant with SDR2 alone.

Figure 14: Shows the relative specificities of protease variants in the absence and presence of competitor substrate.
The protease variants containing two inserts with different sequences and the non-modified scaffold human trypsin
I were expressed in a suitable host. Activity of the protease variants was determined as the cleavage rate of a
peptide with the desired target sequence of TNF-alpha in the absence and presence of competitor substrate.
Specificity is expressed as the ratio of cleavage rates in the presence and absence of competitor.


Figure 15: The figure shows the reduction of cytotoxicity induced by human TNF-alpha when incubating the human
TNF-alpha with concentrated supernatant from cultures expressing the engineered proteolytic enzymes being
specific for human TNF-alpha. This indicates the efficacy of the engineered proteolytic enzymes.

Figure 16: The figure shows the reduction of cytotoxicity induced by human TNF-alpha when incubating the human
TNF-alpha with different concentrations of purified engineered proteolytic enzyme being specific for human
TNF-alpha. Variant g comprises Seq ID No: 72 as SDR1 and Seq ID No: 73 as SDR2. This indicates the efficacy of the
engineered proteolytic enzymes.

Figure 17: The figure compares the activity of engineered proteolytic enzymes being specific for human TNF-alpha
with the activity of human trypsin I on two protein substrates: (a) human TNF-alpha; (b) mixture of human serum
proteins. This indicates the safety of the engineered proteolytic enzymes. Variant x corresponds to Seq ID No: 75
comprising the SDRs according to Seq ID No: 89 (SDR1) and 95 (SDR2). Variants xi and xii correspond to derivatives
thereof comprising the same SDR sequences.


Figure 18: Specific hydrolysis of human VEGF by an engineered proteolytic enzyme derived from human trypsin. 
Examples 

[0157] In the following examples, materials and methods of the present invention are provided, including the
determination of catalytic properties of enzymes obtained by the method. It should be understood that these examples are for
illustrative purposes only and are not to be construed as limiting this invention in any manner.

[0158] In the experimental examples described below, standard techniques of recombinant DNA technology were
used that were described in various publications, e.g. Sambrook et al. (1989), Molecular Cloning: A Laboratory Manual,
Cold Spring Harbor Laboratory, or Ausubel et al. (1987), Current Protocols in Molecular Biology 1987-1988, Wiley
Interscience. Unless otherwise indicated, restriction enzymes, polymerases and other enzymes as well as DNA
purification kits were used according to the manufacturers' specifications.

Example I: Identification of SDR sites in human trypsin 


[0159] Insertion sites for SDRs have been identified in the serine protease human trypsin I (structural class S1) by
comparison with members of the same structural class having a higher sequence specificity. Trypsin represents a
member with low substrate specificity, as it requires only an arginine or lysine residue at the P1 position. On the other
hand, thrombin, tissue-type plasminogen activator or enterokinase all have a high specificity towards their substrate
sequences, i.e. (L/I/V/F)XPR↓NA, CPGR↓VVGG and DDDK↓, respectively. The primary sequences and tertiary structures
of these and further S1 serine proteases have been aligned in order to determine regions of low and high sequence and
structure homology and especially regions that correspond to insertions in the sequences of the more specific proteases
(Figure 2). Several regions of insertions equal to or longer than 3 amino acids representing potential SDR sites have been
identified as indicated in Figure 1. These regions were chosen as target sites for the insertion of SDRs in the examples
below, e.g. SDR1 (region one in figure 2, after amino acid 42 according to SEQ ID NO:1) with a length of six and SDR2
(region three in figure 2, after amino acid 123 according to SEQ ID NO:1) with a length of five amino acids, respectively.

Example II: Molecular cloning of the human trypsin I gene to be used as scaffold protein and expression of the mature
protease in B. subtilis

[0160] The gene encoding the unspecific protease human trypsinogen I was cloned into the vector pUC18. Cloning
was done as follows: the coding sequence of the protein was amplified by PCR using primers that introduced a KpnI
site at the 5' end and a BamHI site at the 3' end. This PCR fragment was cloned into the appropriate sites of the vector
pUC18. Identity was confirmed by sequencing. After sequencing, the coding sequence of the mature protein was amplified
by PCR using primers that introduced different BglI sites at the 5' end and the 3' end.

This PCR fragment was cloned into the appropriate sites of an E. coli - B. subtilis shuttle vector. The vector contains a
pMB1 origin for amplification in E. coli, a neomycin resistance marker for selection in E. coli, as well as a P43 promoter
for the constitutive expression in B. subtilis. An 87 bp fragment that contains the leader sequence encoding the signal
peptide from the sacB gene of B. subtilis was introduced behind the P43 promoter. Different BglI restriction sites serve
as insertion sites for heterologous genes to be expressed.

Expression of human trypsin I was confirmed by measurement of the proteolytic activity in the supernatant of cells containing
the vector with the gene in comparison to a negative control. A peptide including an arginine cleavage site was chosen
as a substrate. The peptide was N-terminally biotinylated and labeled with a fluorophore at the C-terminus. After incubation
of the peptide with culture supernatant, streptavidin was added. Uncleaved peptide associates with streptavidin and leads
to a high read-out value, while cleavage results in low read-out values. Figure 11 shows the time course of the proteolytic
digestion by B. subtilis cells containing the vector with the trypsin I gene in comparison to B. subtilis cells containing the
vector without the trypsin I gene (negative control). As a further confirmation of expression of the protease, supernatants
of cells containing the vector with the gene and of control cells were analyzed by polyacrylamide gel electrophoresis and
subsequent western blot using an antibody specific to the target protease. The procedure was performed according to
standard methods (Sambrook, J.F; Fritsch, E.F.; Maniatis, T.; Cold Spring Harbor Laboratory Press, Second Edition,
1989, New York). Figure 10 confirms expression of the protein only in the cells harbouring the vector with the gene for
trypsin.

Example III: Providing a scaffold protein 

[0161] In this example, human trypsin I was used as the scaffold protein. The gene was either used in its natural form,
or, alternatively, was modified to result in a scaffold protein with increased catalytic activity or further improved
characteristics.

The modification was done by random modification of the gene, followed by expression of the enzyme and subsequent
selection for increased activity. First, the gene was PCR amplified under error-prone conditions, essentially as described
by Cadwell, R.C. and Joyce, G.F. (PCR Methods Appl. 2 (1992) 28-33). Error-prone PCR was done using 30 pmol of
each primer, 20 nmol dGTP and dATP, 100 nmol dCTP and dTTP, 20 fmol template, and 5 U Taq DNA polymerase in
10 mM Tris HCl pH 7.6, 50 mM KCl, 7 mM MgCl2, 0.5 mM MnCl2, 0.01 % gelatin for 20 cycles of 1 min at 94 °C, 1 min
at 65 °C and 1 min at 72 °C. The resulting DNA library was purified using the QIAquick PCR Purification Kit following
the supplier's instructions. The PCR product was digested with the restriction enzyme and purified. Afterwards, the
PCR product was ligated into the E. coli - B. subtilis shuttle vector described above, which was digested with BglI and
dephosphorylated. The ligation products were transformed into E. coli, amplified in LB, and the plasmids were purified
using the Qiagen Plasmid Purification Kit following the supplier's instructions. Resulting plasmids were transformed into
B. subtilis cells.

Alternatively, or in addition to random mutagenesis, variants of the gene were statistically recombined at homologous
positions by use of the Recombination Chain Reaction, essentially as described in WO 0134835. PCR products of the
genes encoding the protease variants were purified using the QIAquick PCR Purification Kit following the suppliers'
instructions, checked for correct size by agarose gel electrophoresis and mixed together in equimolar amounts. 80 µg
of this PCR mix in 150 mM Tris-HCl pH 7.6, 6.6 mM MgCl2 were heated for 5 min at 94 °C and subsequently cooled
down to 37 °C at 0.05 °C/s in order to re-anneal strands and thereby produce heteroduplexes in a stochastic manner.
Then, 2.5 U Exonuclease III per µg DNA were added and incubated for 20, 40 or 60 min at 37 °C in order to digest
different lengths from both 3' ends of the heteroduplexes. The partly digested PCR products were refilled with 0.6 U Pfu
polymerase per µg DNA by incubating for 15 min at 72 °C in 0.17 mM dNTPs and Pfu polymerase buffer according to
the suppliers' instructions. After performing a single PCR cycle, the resulting DNA was purified using the QIAquick PCR
Purification Kit following the suppliers' instructions, digested with BglI and ligated into the linearized vector. The ligation
products were transformed into E. coli, amplified in LB containing ampicillin as marker, and the plasmids were purified
using the Qiagen Plasmid Purification Kit following the suppliers' instructions. Resulting plasmids were transformed into
B. subtilis cells.

Example IV: Insertion of SDRs into the protein scaffold of human trypsin I and generation of an engineered proteolytic
enzyme with specificity for a peptide substrate having the sequence KKWLGRVPGGPV.

[0162] In order to create insertion sites for SDRs in human trypsin I, two pairs of different restriction sites were introduced
into the gene at sites that were identified as potential SDR sites (see Example I above) without changing the amino acid
sequence. The insertion of the restriction sites was done by overlap extension PCR. Primers restr1 and restr2 were
used for the introduction of SacII and BamHI restriction sites. Primers restr3 and restr4 were used for the introduction of KpnI
and NheI restriction sites. The sequences of the primers were as follows:

Binding site for restr1 and restr2 and the corresponding amino acid sequence (SEQ ID NO:64):

5'-GGTGGTAT CAGCAG GCCACTGCTACAAGTCCC GCATCC AGGT-3'
V V S A G H C Y K S R I Q



Forward primer restr1 (SEQ ID NO:56):

5'-GGTGGTATCCGCGGGCCACTGCTACAAGTCCCGGATCCAGGT-3'



Reverse primer restr2 (SEQ ID NO:57):

5'-ACCT GGATCC GGGACTTGTAGCAGTGG CCCGCGGA TACCAGC-3'

Binding site for restr3 and restr4 and the corresponding amino acid sequence (SEQ ID NO:58):

5'-CCACT GGCACGA AGTGCCTCATCTCTGGCTGGGGCAACAC TGCGAGC TCT-3'
T G T K C L I S G W G N T A S S



Forward primer restr3 (SEQ ID NO:60):

5'-CCACTGGCACGAAGTGCCTCATCTCTGGCTGGGGCAACACTGCGAGCTCT-3'



Reverse primer restr4 (SEQ ID NO:61): 

5'-AGAGCTAGCAGTGTTGCGCCAGCCAGAGATGAGGCACTTGGTACCAGTGG-3' 
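
The statement that the restriction sites are introduced without changing the amino acid sequence can be checked by translating the binding site and the mutagenic primer in the same reading frame. The following is a minimal sketch, assuming the restr1 binding site and primer restr1 exactly as printed above (spaces removed) and a reading frame offset of one nucleotide, which is inferred from the printed amino acid sequence. The names `translate`, `WILD_TYPE` and `RESTR1` are illustrative and not part of the disclosure.

```python
# Standard genetic code; codons are enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna, frame=0):
    """Translate all complete codons of `dna`, starting at offset `frame`."""
    end = len(dna) - (len(dna) - frame) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(frame, end, 3))


# Sequences as printed in the text (assumed to be transcribed correctly).
WILD_TYPE = "GGTGGTATCAGCAGGCCACTGCTACAAGTCCCGCATCCAGGT"
RESTR1 = "GGTGGTATCCGCGGGCCACTGCTACAAGTCCCGGATCCAGGT"

SACII_SITE = "CCGCGG"
BAMHI_SITE = "GGATCC"

wild_type_peptide = translate(WILD_TYPE, frame=1)
primer_peptide = translate(RESTR1, frame=1)

print("wild type:", wild_type_peptide)
print("restr1   :", primer_peptide)
print("silent   :", wild_type_peptide == primer_peptide)
print("SacII/BamHI present in primer:",
      SACII_SITE in RESTR1 and BAMHI_SITE in RESTR1)
```

Under these assumptions both sequences translate to the same peptide, while only the primer carries the two recognition sites. The same check could be applied to the second primer pair once its sequences are confirmed.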



[0163] In a first overlap extension PCR, the SacII/BamHI sites were introduced, enabling the insertion of SDR1, and in a
second overlap extension PCR the KpnI/NheI sites, enabling the insertion of SDR2. The product of the overlap extension
PCR was amplified using primers pUG-forward and pUG-reverse. The sequences of pUG-forward and pUG-reverse are
as follows:

pUG-forward (SEQ ID NO:62): 5'-GGGGTACCGGAGGAGGATGAATCCAGTCCT-3'

pUG-reverse (SEQ ID NO:63): 5'-GGGGATCCGGTATAGAGAGTGAAGAGATAG-3'

[0164] The restriction sites generated thereby were subsequently used to insert defined or random oligonucleotides
into the SDR1 and SDR2 insertion sites by standard restriction and ligation methods. Typically, two complementary synthetic
5'-phosphorylated oligonucleotides were annealed and ligated into a vector carrying the modified human trypsin I gene
that was cleaved with the respective restriction enzymes. Oligonucleotides encoding SDR1 were inserted via the SacII/BamHI
sites whereas oligonucleotides encoding SDR2 were inserted via the KpnI/NheI sites. For each insertion an
oligonucleotide pair according to the following general sequences was used ([P] indicating 5'-phosphorylation, N and X
indicating any nucleotide or amino acid residue, respectively):

oligox-SDR1f (SEQ ID NO:64):
5'-[P]-GGGCCACTGCTAC NNNNNNNNNNNNNNNNNN AAGTCCCG-3'
oligox-SDR1r (SEQ ID NO:66):
3'-CGCCCGGTGACGATG NNNNNNNNNNNNNNNNNN TTCAGGGCCTAG-[P]-5'
G H C Y X X X X X X K S

oligox-SDR2f (SEQ ID NO:67):
5'-[P]-CAAGTGCCTCATCTCTGGCTGGGGCAAC NNNNNNNNNNNNNNN ACTG-3'
oligox-SDR2r (SEQ ID NO:69):
3'-CATGGTTCACGGAGTAGAGACCGACCCCGTTG NNNNNNNNNNNNNNN TGACGATC-[P]-5'
K C L I S G W G N X X X X X T

[0165] As an alternative to the above method, a PCR-based method was used for the integration of random sequences
into the SDR1 and SDR2 insertion sites in the modified human trypsin I. For each SDR, one primer was used where
the SDR region is fully randomized. Sequences of the primers were as follows (N = A/C/G/T, B = C/G/T, V = A/C/G):

Primer SDR1-mutnnb-forward (SEQ ID NO:70):

5'-TGGTATCCGCGGGCCACTGCTACNNBNNBNNBNNBNNBNNBAAGTCCCGGATCCAGGTG-3'

Primer SDR2-mutnnb-reverse (SEQ ID NO:71):

5'-GGGGCCAGAGCTAGCAGTVNNVNNVNNVNNVNNGTTGCCCCAGCCAGAGATG-3'

[0166] The codon NNB, or VNN in the reverse strand, allows all 20 amino acids to be encoded, but reduces the probability
of encoding a stop codon from 0.047 to 0.021.
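
The two probabilities quoted for the stop codon frequency follow from counting codons in the standard genetic code: 3 of the 64 NNN codons are stop codons, whereas only TAG remains among the 48 NNB codons. A minimal sketch of this count is given below; the helper names are illustrative, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code; codons are enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def encoded_residues(pos1, pos2, pos3):
    """Translate every codon allowed by a degenerate scheme ('*' marks a stop)."""
    return [CODON_TABLE[a + b + c] for a, b, c in product(pos1, pos2, pos3)]


def stop_fraction(pos1, pos2, pos3):
    """Fraction of the allowed codons that are stop codons."""
    residues = encoded_residues(pos1, pos2, pos3)
    return residues.count("*") / len(residues)


NNN = ("ACGT", "ACGT", "ACGT")  # fully random codon
NNB = ("ACGT", "ACGT", "CGT")   # B = C/G/T at the third position

nnn_stop = stop_fraction(*NNN)  # 3 stop codons out of 64
nnb_stop = stop_fraction(*NNB)  # 1 stop codon (TAG) out of 48
nnb_amino_acids = set(encoded_residues(*NNB)) - {"*"}

print(round(nnn_stop, 3), round(nnb_stop, 3), len(nnb_amino_acids))
# prints: 0.047 0.021 20
```

The count treats all allowed codons as equally likely, which assumes equimolar base mixtures at each degenerate position.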

[0167] As a further alternative, after identification of SDRs that lead to increased specificity, these SDRs were used
as templates for further randomization. Thereby, random peptide sequences were inserted that were partially randomized
at each position and partially identical at each position to the original sequence.

[0168] As an example, random peptide sequences that have in approximately 1 of 3 cases the template amino acid
residue and in approximately 2 of 3 cases any other amino acid residue at each position were inserted into the two SDR
insertion sites of the modified human trypsin I. For this purpose, primers that contain at each nucleotide position of the
SDR approximately 70% of the template bases and 30% of a mixture of the three other bases were used.
With each primer pair a PCR was performed under standard conditions using the human trypsin I gene as template.
The resulting DNA was purified using the QIAquick PCR Purification Kit following the suppliers' instructions and digested
with SacII and NheI. After digestion the DNA was purified and ligated into the SacII and NheI digested and
dephosphorylated vector. The ligation products were transformed into E. coli, amplified in LB containing the respective marker,
and the plasmids were purified using the Qiagen Plasmid Purification Kit following the suppliers' instructions. Resulting
plasmids were transformed into B. subtilis cells. These cells were then separated to single cells, grown to clones, and
after expression of the protease gene screened for proteolytic activity.
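
The relation between 70% template bases per nucleotide position and roughly one template residue in three can be made explicit: a codon stays unchanged with probability 0.7 × 0.7 × 0.7 = 0.343, and synonymous codons raise the retention of the amino acid slightly above that value. The sketch below is an illustration under the assumption that the remaining 30% is split equally over the three other bases and that positions are doped independently; the function names are hypothetical.

```python
from itertools import product

# Standard genetic code; codons are enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def codon_probability(codon, template, p_template=0.7):
    """Probability of obtaining `codon` when each base of `template` is doped."""
    p_other = (1.0 - p_template) / 3.0  # equal share for each non-template base
    prob = 1.0
    for base, template_base in zip(codon, template):
        prob *= p_template if base == template_base else p_other
    return prob


def residue_retention(template, p_template=0.7):
    """Probability that the doped codon still encodes the template amino acid."""
    target = CODON_TABLE[template]
    return sum(
        codon_probability(a + b + c, template, p_template)
        for a, b, c in product(BASES, repeat=3)
        if CODON_TABLE[a + b + c] == target
    )


unchanged_codon = codon_probability("AAG", "AAG")  # 0.7 ** 3
lysine_retention = residue_retention("AAG")        # adds the synonymous codon AAA
tryptophan_retention = residue_retention("TGG")    # Trp has a single codon

print(round(unchanged_codon, 3), round(lysine_retention, 3),
      round(tryptophan_retention, 3))
```

The retention therefore depends on the template codon: it equals 0.343 for residues with a single codon and is somewhat higher for residues with several synonymous codons.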

The following substrates were employed for screening for proteolytic activity (SEQ ID NOs:76 and 77): 

Substrate A and substrate B (peptide sequences not legible in the source).



[0169] Protease variants were screened on substrate B at complexities of 10^ variants by confocal fluorescence
spectroscopy. The substrate was a peptide biotinylated at the N-terminus and fluorescently labeled at the C-terminus.
After incubation of the peptide with supernatant of cells expressing different variants of the protease, streptavidin is
added and the samples are analysed by confocal fluorimetry. The low concentration of the peptide (20 nM) leads to a
preferential cleavage by proteases with a high kcat/KM value, i.e. proteases with high specificity towards the target
sequence.
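
The link between low substrate concentration and kcat/KM can be sketched as follows. When the substrate concentration is far below KM, the Michaelis-Menten rate reduces to v = (kcat/KM)·[E]·[S], so the uncleaved substrate decays exponentially with an observed rate constant k_obs = (kcat/KM)·[E]. The code below is a minimal illustration, assuming that 20 nM is well below KM and that the enzyme concentration is known; the function names and the synthetic time course are hypothetical and are not data from the disclosure.

```python
import math


def observed_rate_constant(times, fraction_uncleaved):
    """Estimate k_obs as the negative least-squares slope of ln(fraction) vs. time."""
    logs = [math.log(f) for f in fraction_uncleaved]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    return -sxy / sxx


def specificity_constant(k_obs, enzyme_concentration):
    """kcat/KM from k_obs, valid only while [S] is much smaller than KM."""
    return k_obs / enzyme_concentration


# Synthetic, noise-free time course (hypothetical values for illustration only).
true_k_obs = 0.01                                          # per second
times = [0.0, 60.0, 120.0, 180.0, 240.0]                   # seconds
fractions = [math.exp(-true_k_obs * t) for t in times]     # fraction uncleaved

k_obs = observed_rate_constant(times, fractions)
kcat_over_km = specificity_constant(k_obs, enzyme_concentration=1e-9)  # 1 nM enzyme

print(k_obs, kcat_over_km)
```

With equal enzyme concentrations, the ratio of the fitted rate constants of two variants then equals the ratio of their kcat/KM values on that substrate.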

[0170] Variants selected in the screening procedure were further evaluated for their specificity towards substrate B
and the closely related substrate A by measuring time courses of the proteolytic digestion and determining the rate constants,
which are proportional to the kcat/KM values. Clearly, compared to the human trypsin that was used as scaffold protein,
the specific activity of variants 1 and 2 (SEQ ID NOs: 2 and 3, respectively) is shifted towards substrate B. Variant 3
(SEQ ID NO:4), on the other hand, serves as a negative control with activities similar to those of human trypsin I. Sequencing
of the genes of the three variants revealed the following amino acid sequences in the SDRs.

Table 2: Sequences of the two SDRs in three different variants selected for
specific hydrolysis of substrate B (SEQ ID NOs: 78-83).

[0171] In a further experiment, a pool of variants containing different numbers of SDRs per gene was screened for
increased specificity using a mixture of the defined substrate and peptone as a competing substrate. Variants containing
one or two SDRs per gene were analyzed further. As a measure of specificity, the activity in the peptide
cleavage assay was compared with and without the competing substrate. The concentration of the
competing substrate was 10 mg/ml. Under these conditions, unspecific proteases show, compared to specific proteases,
a stronger decrease in activity with increasing competitor concentrations (range between 0 and 100 mg/ml). The ratio of
proteolytic activity with and without competing substrate is a quantitative measure for the specificity of the proteases. Figure 9
shows the relative activities with and without competing substrate. Human trypsin I that was used as the scaffold protein
and two variants, one containing only SDR2 and one containing both SDRs, were compared. The specificity of the
variant with both SDRs is higher by a factor of 2.5 than that of the variant with SDR2 only, confirming that there is a
direct relation between the number of SDRs and the quantitative specificity of resulting engineered proteolytic enzymes.
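
The specificity measure used here is a simple ratio of activities, and the comparison between two variants is a ratio of such ratios. The sketch below only illustrates that arithmetic; the function names and the numerical rates are hypothetical placeholders chosen to reproduce a factor of 2.5 and are not measured values from the disclosure.

```python
def relative_activity(rate_with_competitor, rate_without_competitor):
    """Activity retained in the presence of the competing substrate."""
    if rate_without_competitor <= 0:
        raise ValueError("rate without competitor must be positive")
    return rate_with_competitor / rate_without_competitor


def fold_specificity(relative_activity_a, relative_activity_b):
    """How many times more specific variant A is than variant B."""
    return relative_activity_a / relative_activity_b


# Hypothetical cleavage rates in arbitrary units (illustration only).
both_sdrs = relative_activity(rate_with_competitor=50.0, rate_without_competitor=100.0)
sdr2_only = relative_activity(rate_with_competitor=20.0, rate_without_competitor=100.0)

print(both_sdrs, sdr2_only, fold_specificity(both_sdrs, sdr2_only))
```

A value close to 1 for the relative activity indicates that the competitor hardly affects cleavage of the defined substrate, whereas values near 0 indicate an unspecific protease.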

Example V: Generation of an engineered proteolytic enzyme that specifically inactivates human TNF-alpha 

[0172] Human trypsin alpha I or a derivative comprising one or more of the following amino acid substitutions E56G;
R78W; Y131F; A146T; C183R was used as protein scaffold for the generation of an engineered proteolytic enzyme with
high specificity towards human TNF-alpha. The identification of SDR sites in human trypsin I or derivatives thereof was
done as described above. Two insertion sites within the scaffold were chosen for SDRs. The protease variants containing
two inserts with different sequences and also the human trypsin I itself with no inserts were expressed in Bacillus
subtilis cells. The variant protease cells were separated to single cell clones and the protease expressing variants were
screened for proteolytic activity on peptides with the desired target sequence of TNF-alpha. The activity of the protease
variants was determined as the cleavage rate of a peptide with the desired target sequence of TNF-alpha in the absence
and presence of competitor substrate. The specificity is expressed as the ratio of cleavage rates in the presence and
absence of competitor (Fig. 14).

Table 3: Relative specificity of variants of engineered proteolytic enzymes with
different SDR sequences in absence and presence of competitor substrate (SEQ
ID NOs:84-95).

[0173] The antagonistic effect of three protease variants on human TNF-alpha is shown in Figure 15. By the use of
the variants, the induction of apoptosis is almost completely eliminated, indicating the anti-inflammatory efficacy of the
proteases in initiating TNF-alpha breakdown. TNF-alpha was incubated with concentrated supernatant from cultures
expressing the variants i to iii for 2 hours. The resulting TNF-alpha was incubated with non-modified cells for 4
hours. The effect of the remaining TNF-alpha activity was determined as the extent of apoptosis induction by detection
of activated caspase-3 as marker for apoptotic cells. For the controls, either no protease was added with the human
TNF-alpha (dead cells) or buffer was used instead of human TNF-alpha (live cells). An analogous experiment
is shown in Figure 16 using purified variant xiii. TNF-alpha was incubated with different concentrations of the purified
protease variant.

[0174] To demonstrate the specificity of the protease variants, proteins from human blood serum or purified human
TNF-alpha were incubated with human trypsin I or the engineered proteolytic enzyme variants, respectively. Here,
variant x corresponds to SEQ ID NO:75 comprising the same SDRs as variant f, i.e. SDRs according to SEQ ID NO:89
(SDR1) and 95 (SDR2). Variants xi and xii correspond to derivatives thereof comprising the same SDR sequences.
Remaining intact protein was determined as a function of time. While the variants as well as human trypsin I digest
human TNF-alpha, only trypsin shows activity on serum protein (Figure 17 a and b). This demonstrates the high TNF-alpha
specificity of the proteolytic enzymes and indicates their safety and accordingly their low side effects for therapeutic
use.

Example VI: Generation of an engineered proteolytic enzyme that specifically hydrolyses human VEGF.

[0175] Human trypsin I was used as protein scaffold for the generation of an engineered proteolytic enzyme with high
specificity towards human VEGF. The identification of SDR sites in human trypsin I was done as described above. Two
insertion sites within the scaffold were chosen for SDRs. The protease variants containing two inserts with different
sequences were expressed in Bacillus subtilis cells. The variant protease cells were separated to single cell clones and
the protease expressing variants were screened as described above. The activity of the protease variants was determined
as the rate of VEGF cleavage. 4 µg of recombinant human VEGF165 was incubated with 0.18 µg of purified protease
in PBS / pH 7.4 at room temperature. Aliquots were taken at the indicated time points and analysed on a polyacrylamide
gel. The extent of cleavage was quantified by densitometric analysis of the bands. The activity is plotted over incubation
time in Figure 18. Specific cleavage was controlled by further SDS polyacrylamide gel analyses.

SEQUENCE LISTING

[0176]

<110> DIREVO Biotech AG

<120> NEW BIOLOGICAL ENTITIES AND USE THEREOF

<130> 041480WO JH/cw

<160> 96

<170> PatentIn version 3.1

<210> 1
<211> 224
<212> PRT
<213> Homo sapiens

<400> 1

Ile Val Gly Gly Tyr Asn Cys Glu Glu Asn Ser Val Pro Tyr Gln Val
1 5 10 15

Ser Leu Asn Ser Gly Tyr His Phe Cys Gly Gly Ser Leu Ile Asn Glu
20 25 30

Gln Trp Val Val Ser Ala Gly His Cys Tyr Lys Ser Arg Ile Gln Val
35 40 45

Arg Leu Gly Glu His Asn Ile Glu Val Leu Glu Gly Asn Glu Gln Phe
50 55 60

Ile Asn Ala Ala Lys Ile Ile Arg His Pro Gln Tyr Asp Arg Lys Thr
65 70 75 80

Leu Asn Asn Asp Ile Met Leu Ile Lys Leu Ser Ser Arg Ala Val Ile
85 90 95

Asn Ala Arg Val Ser Thr Ile Ser Leu Pro Thr Ala Pro Pro Ala Thr
100 105 110

Gly Thr Lys Cys Leu Ile Ser Gly Trp Gly Asn Thr Ala Ser Ser Gly
115 120 125

Ala Asp Tyr Pro Asp Glu Leu Gln Cys Leu Asp Ala Pro Val Leu Ser
130 135 140

Gln Ala Lys Cys Glu Ala Ser Tyr Pro Gly Lys Ile Thr Ser Asn Met
145 150 155 160

Phe Cys Val Gly Phe Leu Glu Gly Gly Lys Asp Ser Cys Gln Gly Asp
165 170 175

Ser Gly Gly Pro Val Val Cys Asn Gly Gln Leu Gln Gly Val Val Ser
180 185 190

Trp Gly Asp Gly Cys Ala Gln Lys Asn Lys Pro Gly Val Tyr Thr Lys
195 200 205

Val Tyr Asn Tyr Val Lys Trp Ile Lys Asn Thr Ile Ala Ala Asn Ser
210 215 220

<210> 2
<211> 235
<212> PRT
<213> artificial sequence

<220>
<223> trypsin variant 1

<400> 2

Ile Val Gly Gly Tyr Asn Cys Glu Glu Asn Ser Val Pro Tyr Gln Val
1 5 10 15

Ser Leu Asn Ser Gly Tyr His Phe Cys Gly Gly Ser Leu Ile Asn Glu
20 25 30

Gln Trp Val Val Ser Ala Gly His Cys Tyr Asp Ala Val Gly Arg Asp
35 40 45

Lys Ser Arg Ile Gln Val Arg Leu Gly Glu His Asn Ile Glu Val Leu
50 55 60

Glu Gly Asn Glu Gln Phe Ile Asn Ala Ala Lys Ile Ile Arg His Pro
65 70 75 80

Gln Tyr Asp Arg Lys Thr Leu Asn Asn Asp Ile Met Leu Ile Lys Leu
85 90 95

Ser Ser Arg Ala Val Ile Asn Ala Arg Val Ser Thr Ile Ser Leu Pro
100 105 110

Thr Ala Pro Pro Ala Thr Gly Thr Lys Cys Leu Ile Ser Gly Trp Gly
115 120 125

Asn Thr Ile Thr Asn Ser Thr Ala Ser Ser Gly Ala Asp Tyr Pro Asp
130 135 140

Glu Leu Gln Cys Leu Asp Ala Pro Val Leu Ser Gln Ala Lys Cys Glu
145 150 155 160

Ala Ser Tyr Pro Gly Lys Ile Thr Ser Asn Met Phe Cys Val Gly Phe
165 170 175

Leu Glu Gly Gly Lys Asp Ser Cys Gln Gly Asp Ser Gly Gly Pro Val
180 185 190

Val Cys Asn Gly Gln Leu Gln Gly Val Val Ser Trp Gly Asp Gly Cys
195 200 205

Ala Gln Lys Asn Lys Pro Gly Val Tyr Thr Lys Val Tyr Asn Tyr Val
210 215 220

Lys Trp Ile Lys Asn Thr Ile Ala Ala Asn Ser
225 230 235



<210> 3
<211> 235
<212> PRT
<213> artificial sequence

<220>
<223> trypsin variant 2

<400> 3

Ile Val Gly Gly Tyr Asn Cys Glu Glu Asn Ser Val Pro Tyr Gln Val
1 5 10 15

Ser Leu Asn Ser Gly Tyr His Phe Cys Gly Gly Ser Leu Ile Asn Glu
20 25 30

Gln Trp Val Val Ser Ala Gly His Cys Tyr Asn Gly Arg Asp Leu Glu
35 40 45

Lys Ser Arg Ile Gln Val Arg Leu Gly Glu His Asn Ile Glu Val Leu
50 55 60

Glu Gly Asn Glu Gln Phe Ile Asn Ala Ala Lys Ile Ile Arg His Pro
65 70 75 80

Gln Tyr Asp Arg Lys Thr Leu Asn Asn Asp Ile Met Leu Ile Lys Leu
85 90 95

Ser Ser Arg Ala Val Ile Asn Ala Arg Val Ser Thr Ile Ser Leu Pro
100 105 110

Thr Ala Pro Pro Ala Thr Gly Thr Lys Cys Leu Ile Ser Gly Trp Gly
115 120 125

Asn Val Arg Gly Thr Trp Thr Ala Ser Ser Gly Ala Asp Tyr Pro Asp
130 135 140

Glu Leu Gln Cys Leu Asp Ala Pro Val Leu Ser Gln Ala Lys Cys Glu
145 150 155 160

Ala Ser Tyr Pro Gly Lys Ile Thr Ser Asn Met Phe Cys Val Gly Phe
165 170 175

Leu Glu Gly Gly Lys Asp Ser Cys Gln Gly Asp Ser Gly Gly Pro Val
180 185 190

Val Cys Asn Gly Gln Leu Gln Gly Val Val Ser Trp Gly Asp Gly Cys
195 200 205

Ala Gln Lys Asn Lys Pro Gly Val Tyr Thr Lys Val Tyr Asn Tyr Val
210 215 220

Lys Trp Ile Lys Asn Thr Ile Ala Ala Asn Ser
225 230 235



<210> 4
<211> 235
<212> PRT
<213> artificial sequence

<220>
<223> trypsin variant 3

<400> 4

Ile Val Gly Gly Tyr Asn Cys Glu Glu Asn Ser Val Pro Tyr Gln Val
1 5 10 15

Ser Leu Asn Ser Gly Tyr His Phe Cys Gly Gly Ser Leu Ile Asn Glu
20 25 30

Gln Trp Val Val Ser Ala Gly His Cys Tyr Ala Ala Thr Asn Gly Asp
35 40 45

Lys Ser Arg Ile Gln Val Arg Leu Gly Glu His Asn Ile Glu Val Leu
50 55 60

Glu Gly Asn Glu Gln Phe Ile Asn Ala Ala Lys Ile Ile Arg His Pro
65 70 75 80

Gln Tyr Asp Arg Lys Thr Leu Asn Asn Asp Ile Met Leu Ile Lys Leu
85 90 95

Ser Ser Arg Ala Val Ile Asn Ala Arg Val Ser Thr Ile Ser Leu Pro
100 105 110

Thr Ala Pro Pro Ala Thr Gly Thr Lys Cys Leu Ile Ser Gly Trp Gly
115 120 125

Asn Arg Lys Asp Phe Trp Thr Ala Ser Ser Gly Ala Asp Tyr Pro Asp
130 135 140

Glu Leu Gln Cys Leu Asp Ala Pro Val Leu Ser Gln Ala Lys Cys Glu
145 150 155 160

Ala Ser Tyr Pro Gly Lys Ile Thr Ser Asn Met Phe Cys Val Gly Phe
165 170 175

Leu Glu Gly Gly Lys Asp Ser Cys Gln Gly Asp Ser Gly Gly Pro Val
180 185 190

Val Cys Asn Gly Gln Leu Gln Gly Val Val Ser Trp Gly Asp Gly Cys
195 200 205

Ala Gln Lys Asn Lys Pro Gly Val Tyr Thr Lys Val Tyr Asn Tyr Val
210 215 220

Lys Trp Ile Lys Asn Thr Ile Ala Ala Asn Ser
225 230 235



<210> 5 

<211>259 

<212> PRT 

<213> Homo sapiens 

<400>5 
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10 



IS 



20 



25 



30 



35 



40 



lie Val GXu Gly 
1 

Met Leu Phe Arg 

20 

He Ser Asp Arg 

Pro Trp Asp Lys 

50 

Lys His Ser Arg 
65 

Leu Glu Lys He 



Ser Asp 
5 

Lys Ser 

Trp Val 

Asn Phe 

Thr Arg 

70 
Tyr He 
85 

Ala Leu 



Asp Arg Asp He 

100 

Asp Tyr He His Pro Val 
115 

Leu Leu Gin Ala Gly Tyr 
130 

Lys Glu Thr Trp 
145 

Gin Val Val Asn 



Thr Ala 
150 
Leu Pro 
165 

He Thr 



Thr Arg He Arg 

180 

Asp Glu Gly Lys Arg Gly 
195 

Phe Val Met Lys 
210 

Val Ser Trp Gly 
225 

Thr His Val Phe 



Ser Pro 



Glu Gly 
230 
Arg Leu 
245 



Ala Glu 

Pro Gin 

Leu Thr 

40 
Thr Glu 
55 

Tyr Glu 

His Pro 

Met Lys 

Cys Leu 
120 
Lys Gly 
135 

Asn Val 

He Val 

Asp Asn 

Asp Ala 
200 
Phe Asn 
215 

Cys Asp 
Lys Lys 



He Gly Met 
10 

Glu Leu Leu 
25 

Ala Ala His 

Asn Asp Leu 

Arg Asn He 
75 

Arg Tyr Asn 
90 

Leu Lys Lys 
105 

Pro Asp Arg 

Arg Val Thr 

Gly Lys Gly 
155 

Glu Arg Pro 

170 
Met Phe Cys 
185 

Cys Glu Gly 



Ser Pro Trp 

Cys Gly Ala 
30 

Cys Leu Leu 
45 

Leu Val Arg 
60 

Glu Lys He 

Trp Arg Glu 

Pro Val Ala 
110 

Glu Thr Ala 

125 
Gly Trp Gly 
140 

Gin Pro Ser 

Val Cys Lys 



Gin Val 
15 

Ser Leu 

Tyr Pro 

He Gly 

Ser Met 

80 
Asn Leu 
95 

Phe Ser 

Ala Ser 

Asn Leu 

Val Leu 
160 
Asp Ser 
175 

Lys Pro 



Ala Gly Tyr 
190 

Asp Ser Gly Gly Pro 
205 

Asn Arg Trp Tyr Gin Met Gly He 

220 

Lys Tyr Gly 



Arg Asp Gly 
235 

Trp He Gin 
250 



Lys Val He 



Phe Tyr 
240 
Asp Gin 
255 



45 



Phe Gly Glu 



<210> 6
<211> 235
<212> PRT
<213> Homo sapiens

<400> 6

Ile Val Gly Gly Ser Asn Ala Lys Glu Gly Ala Trp Pro Trp Val Val
1 5 10 15

Gly Leu Tyr Tyr Gly Gly Arg Leu Leu Cys Gly Ala Ser Leu Val Ser
20 25 30

Ser Asp Trp Leu Val Ser Ala Ala His Cys Val Tyr Gly Arg Asn Leu
35 40 45

Glu Pro Ser Lys Trp Thr Ala Ile Leu Gly Leu His Met Lys Ser Asn
50 55 60

Leu Thr Ser Pro Gln Thr Val Pro Arg Leu Ile Asp Glu Ile Val Ile
65 70 75 80

Asn Pro His Tyr Asn Arg Arg Arg Lys Asp Asn Asp Ile Ala Met Met
85 90 95

His Leu Glu Phe Lys Val Asn Tyr Thr Asp Tyr Ile Gln Pro Ile Cys
100 105 110

Leu Pro Glu Glu Asn Gln Val Phe Pro Pro Gly Arg Asn Cys Ser Ile
115 120 125

Ala Gly Trp Gly Thr Val Val Tyr Gln Gly Thr Thr Ala Asn Ile Leu
130 135 140

Gln Glu Ala Asp Val Pro Leu Leu Ser Asn Glu Arg Cys Gln Gln Gln
145 150 155 160

Met Pro Glu Tyr Asn Ile Thr Glu Asn Met Ile Cys Ala Gly Tyr Glu
165 170 175

Glu Gly Gly Ile Asp Ser Cys Gln Gly Asp Ser Gly Gly Pro Leu Met
180 185 190

Cys Gln Glu Asn Asn Arg Trp Phe Leu Ala Gly Val Thr Ser Phe Gly
195 200 205

Tyr Lys Cys Ala Leu Pro Asn Arg Pro Gly Val Tyr Ala Arg Val Ser
210 215 220

Arg Phe Thr Glu Trp Ile Gln Ser Phe Leu His
225 230 235



<210> 7
<211> 275
<212> PRT
<213> Bacillus subtilis

<400> 7

Ile Ala His Glu Tyr Ala Gln Ser Val Pro Tyr Gly Ile Ser Gln Ile
1 5 10 15

Lys Ala Pro Ala Leu His Ser Gln Gly Tyr Thr Gly Ser Asn Val Lys
20 25 30

Val Ala Val Ile Asp Ser Gly Ile Asp Ser Ser His Pro Asp Leu Asn
35 40 45

Val Arg Gly Gly Ala Ser Phe Val Pro Ser Glu Thr Asn Pro Tyr Gln
50 55 60

Asp Gly Ser Ser His Gly Thr His Val Ala Gly Thr Ile Ala Ala Leu
65 70 75 80

Asn Asn Ser Ile Gly Val Leu Gly Val Ser Pro Ser Ala Ser Leu Tyr
85 90 95

Ala Val Lys Val Leu Asp Ser Thr Gly Ser Gly Gln Tyr Ser Trp Ile
100 105 110

Ile Asn Gly Ile Glu Trp Ala Ile Ser Asn Asn Met Asp Val Ile Asn
115 120 125

Met Ser Leu Gly Gly Pro Thr Gly Ser Thr Ala Leu Lys Thr Val Val
130 135 140

Asp Lys Ala Val Ser Ser Gly Ile Val Val Ala Ala Ala Ala Gly Asn
145 150 155 160

Glu Gly Ser Ser Gly Ser Thr Ser Thr Val Gly Tyr Pro Ala Lys Tyr
165 170 175

Pro Ser Thr Ile Ala Val Gly Ala Val Asn Ser Ser Asn Gln Arg Ala
180 185 190

Ser Phe Ser Ser Ala Gly Ser Glu Leu Asp Val Met Ala Pro Gly Val
195 200 205

Ser Ile Gln Ser Thr Leu Pro Gly Gly Thr Tyr Gly Ala Tyr Asn Gly
210 215 220

Thr Ser Met Ala Thr Pro His Val Ala Gly Ala Ala Ala Leu Ile Leu
225 230 235 240

Ser Lys His Pro Thr Trp Thr Asn Ala Gln Val Arg Asp Arg Leu Glu
245 250 255

Ser Thr Ala Thr Tyr Leu Gly Asn Ser Phe Tyr Tyr Gly Lys Gly Leu
260 265 270

Ile Asn Val
275

<210> 8
<211> 320
<212> PRT
<213> Murinae gen. sp.

<400> 8

Val Ala Lys Arg Arg Ala Lys Arg Asp Val Tyr Gln Glu Pro Thr Asp
1 5 10 15

Pro Lys Phe Pro Gln Gln Trp Tyr Leu Ser Gly Val Thr Gln Arg Asp
20 25 30

Leu Asn Val Lys Glu Ala Trp Ala Gln Gly Phe Thr Gly His Gly Ile
35 40 45

Val Val Ser Ile Leu Asp Asp Gly Ile Glu Lys Asn His Pro Asp Leu
50 55 60

Ala Gly Asn Tyr Asp Pro Gly Ala Ser Phe Asp Val Asn Asp Gln Asp
65 70 75 80

Pro Asp Pro Gln Pro Arg Tyr Thr Gln Met Asn Asp Asn Arg His Gly
85 90 95

Thr Arg Cys Ala Gly Glu Val Ala Ala Val Ala Asn Asn Gly Val Cys
100 105 110

Gly Val Gly Val Ala Tyr Asn Ala Arg Ile Gly Gly Val Arg Met Leu
115 120 125

Asp Gly Glu Val Thr Asp Ala Val Glu Ala Arg Ser Leu Gly Leu Asn
130 135 140

Pro Asn His Ile His Ile Tyr Ser Ala Ser Trp Gly Pro Glu Asp Asp
145 150 155 160

Gly Lys Thr Val Asp Gly Pro Ala Arg Leu Ala Glu Glu Ala Phe Phe
165 170 175

Arg Gly Val Ser Gln Gly Arg Gly Gly Leu Gly Ser Ile Phe Val Trp
180 185 190

Ala Ser Gly Asn Gly Gly Arg Glu His Asp Ser Cys Asn Cys Asp Gly
195 200 205

Tyr Thr Asn Ser Ile Tyr Thr Leu Ser Ile Ser Ser Ala Thr Gln Phe
210 215 220

Gly Asn Val Pro Trp Tyr Ser Glu Ala Cys Ser Ser Thr Leu Ala Thr
225 230 235 240

Thr Tyr Ser Ser Gly Asn Gln Asn Glu Lys Gln Ile Val Thr Thr Asp
245 250 255

Leu Arg Gln Lys Cys Thr Glu Ser His Thr Gly Thr Ser Ala Ser Ala
260 265 270

Pro Leu Ala Ala Gly Ile Ile Ala Leu Thr Leu Glu Ala Asn Lys Asn
275 280 285

Leu Thr Trp Arg Asp Met Gln His Leu Val Val Gln Thr Ser Lys Pro
290 295 300

Ala His Leu Asn Ala Asp Asp Trp Ala Thr Asn Gly Val Gly Arg Lys
305 310 315 320



<210> 9 
<211>330 
10 <212>PRT 

<213> Homo sapiens 

<400>9 



IS 



45 



SO 



SS 
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Glu Lys Glu Arg 
1 

Leu Phe Asn Asp 

20 

Arg Met Thr Ala 
35 

Trp Gin Lys Gly 
50 

Asp Gly Leu Glu 
€5 

Glu Ala Ser Tyr 

Tyr Asp Pro Thr 

100 

lie Ala Met Gin 

115 

Asn Ser Lys Val 
130 

Ala He Glu Ala 
145 

Tyr Ser Ala Ser 

Pro Gly Arg Leu 

180 

Arg Gin Gly Lys 
195 

Arg Gin Gly Asp 
210 

Thr He Ser He 
225 

Ala Glu Lys Cys 



Ser Lys Arg Ser 
5 

Pro Met Trp Asn 

Ala Leu Pro Lys 

40 

He Thr Gly Lys 
55 

Trp Asn His Thr 
70 

Asp Phe Asn Asp 
85 

Asn Glu Asn Lys 

Ala Asn Asn His 

120 

Gly Gly He Arg 

135 

Ser Ser He Gly 
150 

Trp Gly Pro Asn 
165 

Ala Gin Lys Ala 

Gly Ser He Phe 

200 

Asn Cys Asp Cys 
215 

Ser Ser Ala Ser 
230 

Ser Ser Thr Leu 



Ala Leu Arg Asp 
10 

Gin Gin Trp Tyr 
25 

Leu Asp Leu His 

Gly Val Val He 

60 

Asp He Tyr Ala 
75 

Asn Asp His Asp 
90 

His Gly Thr Arg 
105 

Lys Cys Gly Val 

Met Leu Asp Gly 

140 

Phe Asn Pro Gly 
155 

Asp Asp Gly Lys 
170 

Phe Glu Tyr Gly 
185 

Val Trp Ala Ser 

Asp Gly Tyr Thr 

220 

Gin Gin Gly Leu 

235 

Ala Thr Ser Tyr 



Ser Ala Leu Asn 
15 

Leu Gin Asp Thr 
30 

Val He Pro Val 
45 

Thr Val Leu Asp 

Asn Tyr Asp Pro 

80 

Pro Phe Pro Arg 
95 

Cys Ala Gly Glu 

110 

Gly Val Ala Tyr 

125 

He Val Thr Asp 

His Val Asp He 

160 

Thr Val Glu Gly 

175 

Val Lys Gin Gly 
190 

Gly Asn Gly Gly 
205 

Asp Ser He Tyr 

Ser Pro Trp Tyr 

240 

Ser Ser Gly Asp 
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10 



15 



40 



45 



245 250 255 

Tyr Thr Asp Gin Arg He Thr Ser Ala Asp Leu His Asn Asp Cys Thr 

260 265 270 

GIu Thr His Thr Gly Thr Ser Ala Ser Ala Pro Leu Ala Ala Gly He 

275 280 285 

Phe Ala Leu Ala Leu Glu Ala Asn Pro Asn Leu Thr Trp Arg Asp Met 

290 295 300 

Gin His Leu Val Val Trp Thr Ser Glu Tyr Asp Pro Leu Ala Asn Asn 
305 310 315 320 

Pro Gly Trp Lys Lys Asn Gly Ala Gly Leu 

325 330 



<210> 10 
<211>297 
20 <212> PRT 

<213> Homo sapiens 

<400> 10 

25 

Asn Thr His Pro Cys Gin Ser Asp Met Asn He Glu Gly Ala Trp Lys 
1 5 . 10 15 

Arg Gly Tyr Thr Gly Lys Asn He Val Val Thr He Leu Asp Asp Gly 
30 20 25 30 

He Glu Arg Thr His Pro Asp Leu Met Gin Asn Tyr Asp Ala Leu Ala 

35 40 45 

Ser Cys Asp Val Asn Gly Asn Asp Leu Asp Pro Met Pro Arg Tyr Asp 
35 50 55 60 

Ala Ser Asn Glu Asn Lys His Gly Thr Arg Cys Ala Gly Glu Val Ala 
65 70 75 80 

Ala Ala Ala Asn Asn Ser His Cys Thr Val Gly He Ala Phe Asn Ala 

85 90 95 

Lys He Gly Gly Val Arg Met Leu Asp Gly Asp Val Thr Asp Met Val 

100 105 110 

Glu Ala Lys Ser Val Ser Phe Asn Pro Gin His Val His He Tyr Ser 

115 120 125 

Ala Ser Trp Gly Pro Asp Asp Asp Gly Lys Thr Val Asp Gly Pro Ala 
130 135 140 

50 Pro Leu Thr Arg Gin Ala Phe Glu Asn Gly Val Arg Met Gly Arg Arg 

145 150 155 160 

Gly Leu Gly Ser Val Phe Val Trp Ala Ser Gly Asn Gly Gly Arg Ser 

165 170 175 

Lys Asp His Cys Ser Cys Asp Gly Tyr Thr Asn Ser He Tyr Thr He 



55 



43 



EP 1 633 865 B1 



180 

Ser lie Ser Ser 
195 

Glu Cys Ser Ser 
210 

Asp Lys Lys lie 
225 

His Thr Gly Thr 

Leu Ala Leu Glu 

260 

Val lie Val Arg 
275 

Lys Thr Asn Ala 
290 



Thr Ala Glu Ser 

200 

Thr Leu Ala Thr 
215 

lie Thr Thr Asp 
230 

Ser Ala Ser Ala 
245 

Ala Asn Pro Phe 

Thr Ser Arg Ala 

280 

Ala Gly Phe Lys 
295 



185 

Gly Lys Lys Pro 

Thr Tyr Ser Ser 

220 

Leu Arg Gin Arg 

235 

Pro Met Ala Ala 

250 

Leu Thr Trp Arg 

265 

Gly His Leu Asn 
Val 



190 

Trp Tyr Leu Glu 

205 

Gly Glu Ser Tyr 

Cys Thr Asp Asn 

240 

Gly He He Ala 
255 

Asp Val Gin His 
270 

Ala Asn Asp Trp 
285 



<210> 11 

<211>328 

<212> PRT 

<21 3> Homo sapiens 

<400> 1 1 



44 
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10 



15 



20 



25 



Thr Leu Val Asp Glu Gln Pro Leu Glu Asn Tyr Leu Asp Met Glu Tyr
1 5 10 15

Phe Gly Thr Ile Gly Ile Gly Thr Pro Ala Gln Asp Phe Thr Val Val
20 25 30

Phe Asp Thr Gly Ser Ser Asn Leu Trp Val Pro Ser Val Tyr Cys Ser
35 40 45

Ser Leu Ala Cys Thr Asn His Asn Arg Phe Asn Pro Glu Asp Ser Ser
50 55 60

Thr Tyr Gln Ser Thr Ser Glu Thr Val Ser Ile Thr Tyr Gly Thr Gly
65 70 75 80

Ser Met Thr Gly Ile Leu Gly Tyr Asp Thr Val Gln Val Gly Gly Ile
85 90 95

Ser Asp Thr Asn Gln Ile Phe Gly Leu Ser Glu Thr Glu Pro Gly Ser
100 105 110

Phe Leu Tyr Tyr Ala Pro Phe Asp Gly Ile Leu Gly Leu Ala Tyr Pro
115 120 125

Ser Ile Ser Ser Ser Gly Ala Thr Pro Val Phe Asp Asn Ile Trp Asn
130 135 140

Gln Gly Leu Val Ser Gln Asp Leu Phe Ser Val Tyr Leu Ser Ala Asp
145 150 155 160

Asp Lys Ser Gly Ser Val Val Ile Phe Gly Gly Ile Asp Ser Ser Tyr
165 170 175

Tyr Thr Gly Ser Leu Asn Trp Val Pro Val Thr Val Glu Gly Tyr Trp
180 185 190

Gln Ile Thr Val Asp Ser Ile Thr Met Asn Gly Glu Thr Ile Ala Cys
195 200 205

Ala Glu Gly Cys Gln Ala Ile Val Asp Thr Gly Thr Ser Leu Leu Thr
210 215 220

Gly Pro Thr Ser Pro Ile Ala Asn Ile Gln Ser Asp Ile Gly Ala Ser
225 230 235 240

Glu Asn Ser Asp Gly Asp Met Val Val Ser Cys Ser Ala Ile Ser Ser
245 250 255

Leu Pro Asp Ile Val Phe Thr Ile Asn Gly Val Gln Tyr Pro Val Pro
260 265 270

Pro Ser Ala Tyr Ile Leu Gln Ser Glu Gly Ser Cys Ile Ser Gly Phe
275 280 285

Gln Gly Met Asn Val Pro Thr Glu Ser Gly Glu Leu Trp Ile Leu Gly
290 295 300

Asp Val Phe Ile Arg Gln Tyr Phe Thr Val Phe Asp Arg Ala Asn Asn
305 310 315 320

Gln Val Gly Leu Ala Pro Val Ala
325

<210> 12
<211> 358
<212> PRT
<213> Homo sapiens

<400> 12

Glu Met Val Asp Asn Leu Arg Gly Lys Ser Gly Gln Gly Tyr Tyr Val
1 5 10 15

Glu Met Thr Val Gly Ser Pro Pro Gln Thr Leu Asn Ile Leu Val Asp
20 25 30

Thr Gly Ser Ser Asn Phe Ala Val Gly Ala Ala Pro His Pro Phe Leu
35 40 45

His Arg Tyr Tyr Gln Arg Gln Leu Ser Ser Thr Tyr Arg Asp Leu Arg
50 55 60

Lys Gly Val Tyr Val Pro Tyr Thr Gln Gly Lys Trp Glu Gly Glu Leu
65 70 75 80

Gly Thr Asp Leu Val Ser Ile Pro His Gly Pro Asn Val Thr Val Arg
85 90 95

Ala Asn Ile Ala Ala Ile Thr Glu Ser Asp Lys Phe Phe Ile Asn Gly
100 105 110

Ser Asn Trp Glu Gly Ile Leu Gly Leu Ala Tyr Ala Glu Ile Ala Arg
115 120 125

Pro Asp Asp Ser Leu Glu Pro Phe Phe Asp Ser Leu Val Lys Gln Thr
130 135 140

His Val Pro Asn Leu Phe Ser Leu Gln Leu Cys Gly Ala Gly Phe Pro
145 150 155 160

Leu Asn Gln Ser Glu Val Leu Ala Ser Val Gly Gly Ser Met Ile Ile
165 170 175

Gly Gly Ile Asp His Ser Leu Tyr Thr Gly Ser Leu Trp Tyr Thr Pro
180 185 190

Ile Arg Arg Glu Trp Tyr Tyr Glu Val Ile Ile Val Arg Val Glu Ile
195 200 205

Asn Gly Gln Asp Leu Lys Met Asp Cys Lys Glu Tyr Asn Tyr Asp Lys
210 215 220

Ser Ile Val Asp Ser Gly Thr Thr Asn Leu Arg Leu Pro Lys Lys Val
225 230 235 240

Phe Glu Ala Ala Val Lys Ser Ile Lys Ala Ala Ser Ser Thr Glu Lys
245 250 255

Phe Pro Asp Gly Phe Trp Leu Gly Glu Gln Leu Val Cys Trp Gln Ala
260 265 270

Gly Thr Thr Pro Trp Asn Ile Phe Pro Val Ile Ser Leu Tyr Leu Met
275 280 285

Gly Glu Val Thr Asn Gln Ser Phe Arg Ile Thr Ile Leu Pro Gln Gln
290 295 300

Tyr Leu Arg Pro Val Glu Asp Val Ala Thr Ser Gln Asp Asp Cys Tyr
305 310 315 320

Lys Phe Ala Ile Ser Gln Ser Ser Thr Gly Thr Val Met Gly Ala Val
325 330 335

Ile Met Glu Gly Phe Tyr Val Val Phe Asp Arg Ala Arg Lys Arg Ile
340 345 350

Gly Phe Ala Val Ser Ala
355



<210> 13
<211> 351
<212> PRT
<213> Homo sapiens

<400> 13
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325 330 335 

Val Phe Asp Arg Asp Asn Asn Arg Val Gly Phe Ala Glu Ala Ala
340 345 350

<210> 14
<211> 305
<212> PRT
<213> Homo sapiens

<400> 14

Met Leu Glu Ala Asp Asp Gln Gly Cys Ile Glu Glu Gln Gly Val Glu
1 5 10 15

Asp Ser Ala Asn Glu Asp Ser Val Asp Ala Lys Pro Asp Arg Ser Ser
20 25 30

Phe Val Pro Ser Leu Phe Ser Lys Lys Lys Lys Asn Val Thr Met Arg
35 40 45

Ser Ile Lys Thr Thr Arg Asp Arg Val Pro Thr Tyr Gln Tyr Asn Met
50 55 60

Asn Phe Glu Lys Leu Gly Lys Cys Ile Ile Ile Asn Asn Lys Asn Phe
65 70 75 80

Asp Lys Val Thr Gly Met Gly Val Arg Asn Gly Thr Asp Lys Asp Ala
85 90 95

Glu Ala Leu Phe Lys Cys Phe Arg Ser Leu Gly Phe Asp Val Ile Val
100 105 110

Tyr Asn Asp Cys Ser Cys Ala Lys Met Gln Asp Leu Leu Lys Lys Ala
115 120 125

Ser Glu Glu Asp His Thr Asn Ala Ala Cys Phe Ala Cys Ile Leu Leu
130 135 140

Ser His Gly Glu Glu Asn Val Ile Tyr Gly Lys Asp Gly Val Thr Pro
145 150 155 160

Ile Lys Asp Leu Thr Ala His Phe Arg Gly Asp Arg Ser Lys Thr Leu
165 170 175

Leu Glu Lys Pro Lys Leu Phe Phe Ile Gln Ala Cys Arg Gly Thr Glu
180 185 190

Leu Asp Asp Gly Ile Gln Ala Asp Ser Gly Pro Ile Asn Asp Thr Asp
195 200 205

Ala Asn Pro Arg Tyr Lys Ile Pro Val Glu Ala Asp Phe Leu Phe Ala
210 215 220

Tyr Ser Thr Val Pro Gly Tyr Tyr Ser Trp Arg Ser Pro Gly Arg Gly
225 230 235 240

Ser Trp Phe Val Gln Ala Leu Cys Ser Ile Leu Glu Glu His Gly Lys
245 250 255

Asp Leu Glu Ile Met Gln Ile Leu Thr Arg Val Asn Asp Arg Val Ala
260 265 270

Arg His Phe Glu Ser Gln Ser Asp Asp Pro His Phe His Glu Lys Lys
275 280 285

Gln Ile Pro Cys Val Val Ser Met Leu Thr Lys Glu Leu Tyr Phe Ser
290 295 300

Gln
305



<210> 15
<211> 262
<212> PRT
<213> Streptomyces sp. K15

<400> 15

Val Thr Lys Pro Thr Ile Ala Ala Val Gly Gly Tyr Ala Met Asn Asn
1 5 10 15

Gly Thr Gly Thr Thr Leu Tyr Thr Lys Ala Ala Asp Thr Arg Arg Ser
20 25 30

Thr Gly Ser Thr Thr Lys Ile Met Thr Ala Lys Val Val Leu Ala Gln
35 40 45

Ser Asn Leu Asn Leu Asp Ala Lys Val Thr Ile Gln Lys Ala Tyr Ser
50 55 60

Asp Tyr Val Val Ala Asn Asn Ala Ser Gln Ala His Leu Ile Val Gly
65 70 75 80

Asp Lys Val Thr Val Arg Gln Leu Leu Tyr Gly Leu Met Leu Pro Ser
85 90 95

Gly Cys Asp Ala Ala Tyr Ala Leu Ala Asp Lys Tyr Gly Ser Gly Ser
100 105 110

Thr Arg Ala Ala Arg Val Lys Ser Phe Ile Gly Lys Met Asn Thr Ala
115 120 125

Ala Thr Asn Leu Gly Leu His Asn Thr His Phe Asp Ser Phe Asp Gly
130 135 140

Ile Gly Asn Gly Ala Asn Tyr Ser Thr Pro Arg Asp Leu Thr Lys Ile
145 150 155 160

Ala Ser Ser Ala Met Lys Asn Ser Thr Phe Arg Thr Val Val Lys Thr
165 170 175

Lys Ala Tyr Thr Ala Lys Thr Val Thr Lys Thr Gly Ser Ile Arg Thr
180 185 190

Met Asp Thr Trp Lys Asn Thr Asn Gly Leu Leu Ser Ser Tyr Ser Gly
195 200 205

Ala Ile Gly Val Lys Thr Gly Ser Gly Pro Glu Ala Lys Tyr Cys Leu
210 215 220

Val Phe Ala Ala Thr Arg Gly Gly Lys Thr Val Ile Gly Thr Val Leu
225 230 235 240

Ala Ser Thr Ser Ile Pro Ala Arg Glu Ser Asp Ala Thr Lys Ile Met
245 250 255

Asn Tyr Gly Phe Ala Leu
260



<210> 16
<211> 256
<212> PRT
<213> Human cytomegalovirus

<400> 16

Met Thr Met Asp Glu Gln Gln Ser Gln Ala Val Ala Pro Val Tyr Val
1 5 10 15

Gly Gly Phe Leu Ala Arg Tyr Asp Gln Ser Pro Asp Glu Ala Glu Leu
20 25 30

Leu Leu Pro Arg Asp Val Val Glu His Trp Leu His Ala Gln Gly Gln
35 40 45

Gly Gln Pro Ser Leu Ser Val Ala Leu Pro Leu Asn Ile Asn His Asp
50 55 60

Asp Thr Ala Val Val Gly His Val Ala Ala Met Gln Ser Val Arg Asp
65 70 75 80

Gly Leu Phe Cys Leu Gly Cys Val Thr Ser Pro Arg Phe Leu Glu Ile
85 90 95

Val Arg Arg Ala Ser Glu Lys Ser Glu Leu Val Ser Arg Gly Pro Val
100 105 110

Ser Pro Leu Gln Pro Asp Lys Val Val Glu Phe Leu Ser Gly Ser Tyr
115 120 125

Ala Gly Leu Ser Leu Ser Ser Arg Arg Cys Asp Asp Val Glu Gln Ala
130 135 140

Thr Ser Leu Ser Gly Ser Glu Thr Thr Pro Phe Lys His Val Ala Leu
145 150 155 160

Cys Ser Val Gly Arg Arg Arg Gly Thr Leu Ala Val Tyr Gly Arg Asp
165 170 175

Pro Glu Trp Val Thr Gln Arg Phe Pro Asp Leu Thr Ala Ala Asp Arg
180 185 190

Asp Gly Leu Arg Ala Gln Trp Gln Arg Cys Gly Ser Thr Ala Val Asp
195 200 205

Ala Ser Gly Asp Pro Phe Arg Ser Asp Ser Tyr Gly Leu Leu Gly Asn
210 215 220

Ser Val Asp Ala Leu Tyr Ile Arg Glu Arg Leu Pro Lys Leu Arg Tyr
225 230 235 240

Asp Lys Gln Leu Val Gly Val Thr Glu Arg Glu Ser Tyr Val Lys Ala
245 250 255



<210> 17
<211> 248
<212> PRT
<213> Escherichia coli

<400> 17

Val Arg Ser Phe Ile Tyr Glu Pro Phe Gln Ile Pro Ser Gly Ser Met
1 5 10 15

Met Pro Thr Leu Leu Ile Gly Asp Phe Ile Leu Val Glu Lys Phe Ala
20 25 30

Tyr Gly Ile Lys Asp Pro Ile Tyr Gln Lys Thr Leu Ile Glu Thr Gly
35 40 45

His Pro Lys Arg Gly Asp Ile Val Val Phe Lys Tyr Pro Glu Asp Pro
50 55 60

Lys Leu Asp Tyr Ile Lys Arg Ala Val Gly Leu Pro Gly Asp Lys Val
65 70 75 80

Thr Tyr Asp Pro Val Ser Lys Glu Leu Thr Ile Gln Pro Gly Cys Ser
85 90 95

Ser Gly Gln Ala Cys Glu Asn Ala Leu Pro Val Thr Tyr Ser Asn Val
100 105 110

Glu Pro Ser Asp Phe Val Gln Thr Phe Ser Arg Arg Asn Gly Gly Glu
115 120 125

Ala Thr Ser Gly Phe Phe Glu Val Pro Lys Asn Glu Thr Lys Glu Asn
130 135 140

Gly Ile Arg Leu Ser Glu Arg Lys Glu Thr Leu Gly Asp Val Thr His
145 150 155 160

Arg Ile Leu Thr Val Pro Ile Ala Gln Asp Gln Val Gly Met Tyr Tyr
165 170 175

Gln Gln Pro Gly Gln Gln Leu Ala Thr Trp Ile Val Pro Pro Gly Gln
180 185 190

Tyr Phe Met Met Gly Asp Asn Arg Asp Asn Ser Ala Asp Ser Arg Tyr
195 200 205

Trp Gly Phe Val Pro Glu Ala Asn Leu Val Gly Arg Ala Thr Ala Ile
210 215 220

Trp Met Ser Phe Asp Lys Gln Glu Gly Glu Trp Pro Thr Gly Leu Arg
225 230 235 240

Leu Ser Arg Ile Gly Gly Ile His
245

<210> 18
<211> 317
<212> PRT
<213> Serratia marcescens

<400> 18

Met Glu Gln Leu Arg Gly Leu Tyr Pro Pro Leu Ala Ala Tyr Asp Ser
1 5 10 15

Gly Trp Leu Asp Thr Gly Asp Gly His Arg Ile Tyr Trp Glu Leu Ser
20 25 30

Gly Asn Pro Asn Gly Lys Pro Ala Val Phe Ile His Gly Gly Pro Gly
35 40 45

Gly Gly Ile Ser Pro His His Arg Gln Leu Phe Asp Pro Glu Arg Tyr
50 55 60

Lys Val Leu Leu Phe Asp Gln Arg Gly Cys Gly Arg Ser Arg Pro His
65 70 75 80

Ala Ser Leu Asp Asn Asn Thr Thr Trp His Leu Val Ala Asp Ile Glu
85 90 95

Arg Leu Arg Glu Met Ala Gly Val Glu Gln Trp Leu Val Phe Gly Gly
100 105 110

Ser Trp Gly Ser Thr Leu Ala Leu Ala Tyr Ala Gln Thr His Pro Glu
115 120 125

Arg Val Ser Glu Met Val Leu Arg Gly Ile Phe Thr Leu Arg Lys Gln
130 135 140

Arg Leu His Trp Tyr Tyr Gln Asp Gly Ala Ser Arg Phe Phe Pro Glu
145 150 155 160

Lys Trp Glu Arg Val Leu Ser Ile Leu Ser Asp Asp Glu Arg Lys Asp
165 170 175

Val Ile Ala Ala Tyr Arg Gln Arg Leu Thr Ser Ala Asp Pro Gln Val
180 185 190

Gln Leu Glu Ala Ala Lys Leu Trp Ser Val Trp Glu Gly Glu Thr Val
195 200 205

Thr Leu Leu Pro Ser Arg Glu Ser Ala Ser Phe Gly Glu Asp Asp Phe
210 215 220

Ala Leu Ala Phe Ala Arg Ile Glu Asn His Tyr Phe Thr His Leu Gly
225 230 235 240

Phe Leu Glu Ser Asp Asp Gln Leu Leu Arg Asn Val Pro Leu Ile Arg
245 250 255

His Ile Pro Ala Val Ile Val His Gly Arg Tyr Asp Met Ala Cys Gln
260 265 270

Val Gln Asn Ala Trp Asp Leu Ala Lys Ala Trp Pro Glu Ala Glu Leu
275 280 285

His Ile Val Glu Gly Ala Gly His Ser Tyr Asp Glu Pro Gly Ile Leu
290 295 300

His Gln Leu Met Ile Ala Thr Asp Arg Phe Ala Gly Lys
305 310 315



<210> 19
<211> 229
<212> PRT
<213> Escherichia coli

<400> 19

Met Glu Leu Leu Leu Leu Ser Asn Ser Thr Leu Pro Gly Lys Ala Trp
1 5 10 15

Leu Glu His Ala Leu Pro Leu Ile Ala Asn Gln Leu Asn Gly Arg Arg
20 25 30

Ser Ala Val Phe Ile Pro Phe Ala Gly Val Thr Gln Thr Trp Asp Glu
35 40 45

Tyr Thr Asp Lys Thr Ala Glu Val Leu Ala Pro Leu Gly Val Asn Val
50 55 60

Thr Gly Ile His Arg Val Ala Asp Pro Leu Ala Ala Ile Glu Lys Ala
65 70 75 80

Glu Ile Ile Ile Val Gly Gly Gly Asn Thr Phe Gln Leu Leu Lys Glu
85 90 95

Ser Arg Glu Arg Gly Leu Leu Ala Pro Met Ala Asp Arg Val Lys Arg
100 105 110

Gly Ala Leu Tyr Ile Gly Trp Ser Ala Gly Ala Asn Leu Ala Cys Pro
115 120 125

Thr Ile Arg Thr Thr Asn Asp Met Pro Ile Val Asp Pro Asn Gly Phe
130 135 140

Asp Ala Leu Asp Leu Phe Pro Leu Gln Ile Asn Pro His Phe Thr Asn
145 150 155 160

Ala Leu Pro Glu Gly His Lys Gly Glu Thr Arg Glu Gln Arg Ile Arg
165 170 175

Glu Leu Leu Val Val Ala Pro Glu Leu Thr Val Ile Gly Leu Pro Glu
180 185 190

Gly Asn Trp Ile Gln Val Ser Asn Gly Gln Ala Val Leu Gly Gly Pro
195 200 205

Asn Thr Thr Trp Val Phe Lys Ala Gly Glu Glu Ala Val Ala Leu Glu
210 215 220

Ala Gly His Arg Phe
225



<210> 20
<211> 99
<212> PRT
<213> Human immunodeficiency virus

<400> 20

Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Val Lys Ile Gly
1 5 10 15

Gly Gln Leu Arg Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

Leu Glu Asp Ile Asn Leu Pro Gly Lys Trp Lys Pro Lys Met Ile Gly
35 40 45

Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Leu Ile
50 55 60

Glu Ile Cys Gly Lys Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

Pro Val Asn Ile Ile Gly Arg Asn Met Leu Thr Gln Ile Gly Cys Thr
85 90 95

Leu Asn Phe



<210> 21
<211> 297
<212> PRT
<213> Escherichia coli

<400> 21

Ser Thr Glu Thr Leu Ser Phe Thr Pro Asp Asn Ile Asn Ala Asp Ile
1 5 10 15

Ser Leu Gly Thr Leu Ser Gly Lys Thr Lys Glu Arg Val Tyr Leu Ala
20 25 30

Glu Glu Gly Gly Arg Lys Val Ser Gln Leu Asp Trp Lys Phe Asn Asn
35 40 45

Ala Ala Ile Ile Lys Gly Ala Ile Asn Trp Asp Leu Met Pro Gln Ile
50 55 60

Ser Ile Gly Ala Ala Gly Trp Thr Thr Leu Gly Ser Arg Gly Gly Asn
65 70 75 80

Met Val Asp Gln Asp Trp Met Asp Ser Ser Asn Pro Gly Thr Trp Thr
85 90 95

Asp Glu Ala Arg His Pro Asp Thr Gln Leu Asn Tyr Ala Asn Glu Phe
100 105 110

Asp Leu Asn Ile Lys Gly Trp Leu Leu Asn Glu Pro Asn Tyr Arg Leu
115 120 125

Gly Leu Met Ala Gly Tyr Gln Glu Ser Arg Tyr Ser Phe Thr Ala Arg
130 135 140

Gly Gly Ser Tyr Ile Tyr Ser Ser Glu Glu Gly Phe Arg Asp Asp Ile
145 150 155 160

Gly Ser Phe Pro Asn Gly Glu Arg Ala Ile Gly Tyr Lys Gln Arg Phe
165 170 175

Lys Met Pro Tyr Ile Gly Leu Thr Gly Ser Tyr Arg Tyr Glu Asp Phe
180 185 190

Glu Leu Gly Gly Thr Phe Lys Tyr Ser Gly Trp Val Glu Ser Ser Asp
195 200 205

Asn Asp Glu His Tyr Asp Pro Lys Gly Arg Ile Thr Tyr Arg Ser Lys
210 215 220

Val Lys Asp Gln Asn Tyr Tyr Ser Val Ala Val Asn Ala Gly Tyr Tyr
225 230 235 240

Val Thr Pro Asn Ala Lys Val Tyr Val Glu Gly Ala Trp Asn Arg Val
245 250 255

Thr Asn Lys Lys Gly Asn Thr Ser Leu Tyr Asp His Asn Asn Asn Thr
260 265 270

Ser Asp Tyr Ser Lys Asn Gly Ala Gly Ile Glu Asn Tyr Asn Phe Ile
275 280 285

Thr Thr Ala Gly Leu Lys Tyr Thr Phe
290 295
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<210> 22 
<211> 212
<212> PRT
<213> Carica papaya

<400> 22

Ile Pro Glu Tyr Val Asp Trp Arg Gln Lys Gly Ala Val Thr Pro Val
1 5 10 15

Lys Asn Gln Gly Ser Cys Gly Ser Cys Trp Ala Phe Ser Ala Val Val
20 25 30

Thr Ile Glu Gly Ile Ile Lys Ile Arg Thr Gly Asn Leu Asn Gln Tyr
35 40 45

Ser Glu Gln Glu Leu Leu Asp Cys Asp Arg Arg Ser Tyr Gly Cys Asn
50 55 60

Gly Gly Tyr Pro Trp Ser Ala Leu Gln Leu Val Ala Gln Tyr Gly Ile
65 70 75 80

His Tyr Arg Asn Thr Tyr Pro Tyr Glu Gly Val Gln Arg Tyr Cys Arg
85 90 95

Ser Arg Glu Lys Gly Pro Tyr Ala Ala Lys Thr Asp Gly Val Arg Gln
100 105 110

Val Gln Pro Tyr Asn Gln Gly Ala Leu Leu Tyr Ser Ile Ala Asn Gln
115 120 125

Pro Val Ser Val Val Leu Gln Ala Ala Gly Lys Asp Phe Gln Leu Tyr
130 135 140

Arg Gly Gly Ile Phe Val Gly Pro Cys Gly Asn Lys Val Asp His Ala
145 150 155 160

Val Ala Ala Val Gly Tyr Gly Pro Asn Tyr Ile Leu Ile Lys Asn Ser
165 170 175

Trp Gly Thr Gly Trp Gly Glu Asn Gly Tyr Ile Arg Ile Lys Arg Gly
180 185 190

Thr Gly Asn Ser Tyr Gly Val Cys Gly Leu Tyr Thr Ser Ser Phe Tyr
195 200 205

Pro Val Lys Asn
210

<210> 23
<211> 699
<212> PRT
<213> Homo sapiens

<400> 23

Ala Gly Ile Ala Ala Lys Leu Ala Lys Asp Arg Glu Ala Ala Glu Gly
1 5 10 15

Leu Gly Ser His Glu Arg Ala Ile Lys Tyr Leu Asn Gln Asp Tyr Glu
20 25 30

Ala Leu Arg Asn Glu Cys Leu Glu Ala Gly Thr Leu Phe Gln Asp Pro
35 40 45

Ser Phe Pro Ala Ile Pro Ser Ala Leu Gly Phe Lys Glu Leu Gly Pro
50 55 60

Tyr Ser Ser Lys Thr Arg Gly Met Arg Trp Lys Arg Pro Thr Glu Ile
65 70 75 80

Cys Ala Asp Pro Gln Phe Ile Ile Gly Gly Ala Thr Arg Thr Asp Ile
85 90 95

Cys Gln Gly Ala Leu Gly Asp Cys Trp Leu Leu Ala Ala Ile Ala Ser
100 105 110

Leu Thr Leu Asn Glu Glu Ile Leu Ala Arg Val Val Pro Leu Asn Gln
115 120 125

Ser Phe Gln Glu Asn Tyr Ala Gly Ile Phe His Phe Gln Phe Trp Gln
130 135 140

Tyr Gly Glu Trp Val Glu Val Val Val Asp Asp Arg Leu Pro Thr Lys
145 150 155 160

Asp Gly Glu Leu Leu Phe Val His Ser Ala Glu Gly Ser Glu Phe Trp
165 170 175

Ser Ala Leu Leu Glu Lys Ala Tyr Ala Lys Ile Asn Gly Cys Tyr Glu
180 185 190

Ala Leu Ser Gly Gly Ala Thr Thr Glu Gly Phe Glu Asp Phe Thr Gly
195 200 205

Gly Ile Ala Glu Trp Tyr Glu Leu Lys Lys Pro Pro Pro Asn Leu Phe
210 215 220

Lys Ile Ile Gln Lys Ala Leu Gln Lys Gly Ser Leu Leu Gly Cys Ser
225 230 235 240

Ile Asp Ile Thr Ser Ala Ala Asp Ser Glu Ala Ile Thr Phe Gln Lys
245 250 255

Leu Val Lys Gly His Ala Tyr Ser Val Thr Gly Ala Glu Glu Val Glu
260 265 270

Ser Asn Gly Ser Leu Gln Lys Leu Ile Arg Ile Arg Asn Pro Trp Gly
275 280 285

Glu Val Glu Trp Thr Gly Arg Trp Asn Asp Asn Cys Pro Ser Trp Asn
290 295 300

Thr Ile Asp Pro Glu Glu Arg Glu Arg Leu Thr Arg Arg His Glu Asp
305 310 315 320

Gly Glu Phe Trp Met Ser Phe Ser Asp Phe Leu Arg His Tyr Ser Arg
325 330 335

Leu Glu Ile Cys Asn Leu Thr Pro Asp Thr Leu Thr Ser Asp Thr Tyr
340 345 350

Lys Lys Trp Lys Leu Thr Lys Met Asp Gly Asn Trp Arg Arg Gly Ser
355 360 365

Thr Ala Gly Gly Cys Arg Asn Tyr Pro Asn Thr Phe Trp Met Asn Pro
370 375 380

Gln Tyr Leu Ile Lys Leu Glu Glu Glu Asp Glu Asp Glu Glu Asp Gly
385 390 395 400






Glu Ser Gly Cys Thr Phe Leu Val Gly Leu Ile Gln Lys His Arg Arg
405 410 415

Arg Gln Arg Lys Met Gly Glu Asp Met His Thr Ile Gly Phe Gly Ile
420 425 430

Tyr Glu Val Pro Glu Glu Leu Ser Gly Gln Thr Asn Ile His Leu Ser
435 440 445

Lys Asn Phe Phe Leu Thr Asn Arg Ala Arg Glu Arg Ser Asp Thr Phe
450 455 460

Ile Asn Leu Arg Glu Val Leu Asn Arg Phe Lys Leu Pro Pro Gly Glu
465 470 475 480

Tyr Ile Leu Val Pro Ser Thr Phe Glu Pro Asn Lys Asp Gly Asp Phe
485 490 495

Cys Ile Arg Val Phe Ser Glu Lys Lys Ala Asp Tyr Gln Ala Val Asp
500 505 510

Asp Glu Ile Glu Ala Asn Leu Glu Glu Phe Asp Ile Ser Glu Asp Asp
515 520 525

Ile Asp Asp Gly Val Arg Arg Leu Phe Ala Gln Leu Ala Gly Glu Asp
530 535 540

Ala Glu Ile Ser Ala Phe Glu Leu Gln Thr Ile Leu Arg Arg Val Leu
545 550 555 560

Ala Lys Arg Gln Asp Ile Lys Ser Asp Gly Phe Ser Ile Glu Thr Cys
565 570 575

Lys Ile Met Val Asp Met Leu Asp Ser Asp Gly Ser Gly Lys Leu Gly
580 585 590

Leu Lys Glu Phe Tyr Ile Leu Trp Thr Lys Ile Gln Lys Tyr Gln Lys
595 600 605

Ile Tyr Arg Glu Ile Asp Val Asp Arg Ser Gly Thr Met Asn Ser Tyr
610 615 620

Glu Met Arg Lys Ala Leu Glu Glu Ala Gly Phe Lys Met Pro Cys Gln
625 630 635 640

Leu His Gln Val Ile Val Ala Arg Phe Ala Asp Asp Gln Leu Ile Ile
645 650 655

Asp Phe Asp Asn Phe Val Arg Cys Leu Val Arg Leu Glu Thr Leu Phe
660 665 670

Lys Ile Phe Lys Gln Leu Asp Pro Glu Asn Thr Gly Thr Ile Glu Leu
675 680 685

Asp Leu Ile Ser Trp Leu Cys Phe Ser Val Leu
690 695



<210> 24
<211> 221
<212> PRT
<213> Tobacco etch virus

<400> 24

Gly Glu Ser Leu Phe Lys Gly Pro Arg Asp Tyr Asn Pro Ile Ser Ser
1 5 10 15

Thr Ile Cys His Leu Thr Asn Glu Ser Asp Gly His Thr Thr Ser Leu
20 25 30

Tyr Gly Ile Gly Phe Gly Pro Phe Ile Ile Thr Asn Lys His Leu Phe
35 40 45

Arg Arg Asn Asn Gly Thr Leu Leu Val Gln Ser Leu His Gly Val Phe
50 55 60

Lys Val Lys Asn Thr Thr Thr Leu Gln Gln His Leu Ile Asp Gly Arg
65 70 75 80

Asp Met Ile Ile Ile Arg Met Pro Lys Asp Phe Pro Pro Phe Pro Gln
85 90 95

Lys Leu Lys Phe Arg Glu Pro Gln Arg Glu Glu Arg Ile Cys Leu Val
100 105 110

Thr Thr Asn Phe Gln Thr Lys Ser Met Ser Ser Met Val Ser Asp Thr
115 120 125

Ser Cys Thr Phe Pro Ser Ser Asp Gly Ile Phe Trp Lys His Trp Ile
130 135 140

Gln Thr Lys Asp Gly Gln Cys Gly Ser Pro Leu Val Ser Thr Arg Asp
145 150 155 160

Gly Phe Ile Val Gly Ile His Ser Ala Ser Asn Phe Thr Asn Thr Asn
165 170 175

Asn Tyr Phe Thr Ser Val Pro Lys Asn Phe Met Glu Leu Leu Thr Asn
180 185 190

Gln Glu Ala Gln Gln Trp Val Ser Gly Trp Arg Leu Asn Ala Asp Ser
195 200 205

Val Leu Trp Gly Gly His Lys Val Phe Met Asp Lys Pro
210 215 220

<210> 25
<211> 371
<212> PRT
<213> Streptococcus pyogenes

<400> 25

Asp Gln Asn Phe Ala Arg Asn Glu Lys Glu Ala Lys Asp Ser Ala Ile
1 5 10 15

Thr Phe Ile Gln Lys Ser Ala Ala Ile Lys Ala Gly Ala Arg Ser Ala
20 25 30

Glu Asp Ile Lys Leu Asp Lys Val Asn Leu Gly Gly Glu Leu Ser Gly
35 40 45

Ser Asn Met Tyr Val Tyr Asn Ile Ser Thr Gly Gly Phe Val Ile Val
50 55 60

Ser Gly Asp Lys Arg Ser Pro Glu Ile Leu Gly Tyr Ser Thr Ser Gly
65 70 75 80

Ser Phe Asp Val Asn Gly Lys Glu Asn Ile Ala Ser Phe Met Glu Ser
85 90 95

Tyr Val Glu Gln Ile Lys Glu Asn Lys Lys Leu Asp Ser Thr Tyr Ala
100 105 110

Gly Thr Ala Glu Ile Lys Gln Pro Val Val Lys Ser Leu Leu Asp Ser
115 120 125

Lys Gly Ile His Tyr Asn Gln Gly Asn Pro Tyr Asn Leu Leu Thr Pro
130 135 140

Val Ile Glu Lys Val Lys Pro Gly Glu Gln Ser Phe Val Gly Gln His
145 150 155 160

Ala Ala Thr Gly Ser Val Ala Thr Ala Thr Ala Gln Ile Met Lys Tyr
165 170 175

His Asn Tyr Pro Asn Lys Gly Leu Lys Asp Tyr Thr Tyr Thr Leu Ser
180 185 190

Ser Asn Asn Pro Tyr Phe Asn His Pro Lys Asn Leu Phe Ala Ala Ile
195 200 205

Ser Thr Arg Gln Tyr Asn Trp Asn Asn Ile Leu Pro Thr Tyr Ser Gly
210 215 220

Arg Glu Ser Asn Val Gln Lys Met Ala Ile Ser Glu Leu Met Ala Asp
225 230 235 240

Val Gly Ile Ser Val Asp Met Asp Tyr Gly Pro Ser Ser Gly Ser Ala
245 250 255

Gly Ser Ser Arg Val Gln Arg Ala Leu Lys Glu Asn Phe Gly Tyr Asn
260 265 270

Gln Ser Val His Gln Ile Asn Arg Gly Asp Phe Ser Lys Gln Asp Trp
275 280 285

Glu Ala Gln Ile Asp Lys Glu Leu Ser Gln Asn Gln Pro Val Tyr Tyr
290 295 300

Gln Gly Val Gly Lys Val Gly Gly His Ala Phe Val Ile Asp Gly Ala
305 310 315 320

Asp Gly Arg Asn Phe Tyr His Val Asn Trp Gly Trp Gly Gly Val Ser
325 330 335

Asp Gly Phe Phe Arg Leu Asp Ala Leu Asn Pro Ser Ala Leu Gly Thr
340 345 350

Gly Gly Gly Ala Gly Gly Phe Asn Gly Tyr Gln Ser Ala Val Val Gly
355 360 365

Ile Lys Pro
370



<210> 26
<211> 353
<212> PRT
<213> Homo sapiens

<400> 26

Lys Lys His Thr Gly Tyr Val Gly Leu Lys Asn Gln Gly Ala Thr Cys
1 5 10 15

Tyr Met Asn Ser Leu Leu Gln Thr Leu Phe Phe Thr Asn Gln Leu Arg
20 25 30

Lys Ala Val Tyr Met Met Pro Thr Glu Gly Asp Asp Ser Ser Lys Ser
35 40 45

Val Pro Leu Ala Leu Gln Arg Val Phe Tyr Glu Leu Gln His Ser Asp
50 55 60

Lys Pro Val Gly Thr Lys Lys Leu Thr Lys Ser Phe Gly Trp Glu Thr
65 70 75 80

Leu Asp Ser Phe Met Gln His Asp Val Gln Glu Leu Cys Arg Val Leu
85 90 95

Leu Asp Asn Val Glu Asn Lys Met Lys Gly Thr Cys Val Glu Gly Thr
100 105 110

Ile Pro Lys Leu Phe Arg Gly Lys Met Val Ser Tyr Ile Gln Cys Lys
115 120 125

Glu Val Asp Tyr Arg Ser Asp Arg Arg Glu Asp Tyr Tyr Asp Ile Gln
130 135 140

Leu Ser Ile Lys Gly Lys Lys Asn Ile Phe Glu Ser Phe Val Asp Tyr
145 150 155 160

Val Ala Val Glu Gln Leu Asp Gly Asp Asn Lys Tyr Asp Ala Gly Glu
165 170 175

His Gly Leu Gln Glu Ala Glu Lys Gly Val Lys Phe Leu Thr Leu Pro
180 185 190

Pro Val Leu His Leu Gln Leu Met Arg Phe Met Tyr Asp Pro Gln Thr
195 200 205

Asp Gln Asn Ile Lys Ile Asn Asp Arg Phe Glu Phe Pro Glu Gln Leu
210 215 220

Pro Leu Asp Glu Phe Leu Gln Lys Thr Asp Pro Lys Asp Pro Ala Asn
225 230 235 240

Tyr Ile Leu His Ala Val Leu Val His Ser Gly Asp Asn His Gly Gly
245 250 255

His Tyr Val Val Tyr Leu Asn Pro Lys Gly Asp Gly Lys Trp Cys Lys
260 265 270

Phe Asp Asp Asp Val Val Ser Arg Cys Thr Lys Glu Glu Ala Ile Glu
275 280 285

His Asn Tyr Gly Gly His Asp Asp Asp Leu Ser Val Arg His Cys Thr
290 295 300

Asn Ala Tyr Met Leu Val Tyr Ile Arg Glu Ser Lys Leu Ser Glu Val
305 310 315 320

Leu Gln Ala Val Thr Asp His Asp Ile Pro Gln Gln Leu Val Glu Arg
325 330 335

Leu Gln Glu Glu Lys Arg Ile Glu Ala Gln Lys Arg Lys Glu Arg Gln
340 345 350

Glu



<210> 27
<211> 174
<212> PRT
<213> Staphylococcus aureus

<400> 27

Tyr Asn Glu Gln Tyr Val Asn Lys Leu Glu Asn Phe Lys Ile Arg Glu
1 5 10 15

Thr Gln Gly Asn Asn Gly Trp Cys Ala Gly Tyr Thr Met Ser Ala Leu
20 25 30

Leu Asn Ala Thr Tyr Asn Thr Asn Lys Tyr His Ala Glu Ala Val Met
35 40 45

Arg Phe Leu His Pro Asn Leu Gln Gly Gln Gln Phe Gln Phe Thr Gly
50 55 60

Leu Thr Pro Arg Glu Met Ile Tyr Phe Gly Gln Thr Gln Gly Arg Ser
65 70 75 80

Pro Gln Leu Leu Asn Arg Met Thr Thr Tyr Asn Glu Val Asp Asn Leu
85 90 95

Thr Lys Asn Asn Lys Gly Ile Ala Ile Leu Gly Ser Arg Val Glu Ser
100 105 110

Arg Asn Gly Met His Ala Gly His Ala Met Ala Val Val Gly Asn Ala
115 120 125

Lys Leu Asn Asn Gly Gln Glu Val Ile Ile Ile Trp Asn Pro Trp Asp
130 135 140

Asn Gly Phe Met Thr Gln Asp Ala Lys Asn Asn Val Ile Pro Val Ser
145 150 155 160

Asn Gly Asp His Tyr Gln Trp Tyr Ser Ser Ile Tyr Gly Tyr
165 170



<210> 28
<211> 221
<212> PRT
<213> Saccharomyces cerevisiae

<400> 28

Gly Ser Leu Val Pro Glu Leu Asn Glu Lys Asp Asp Asp Gln Val Gln
1 5 10 15

Lys Ala Leu Ala Ser Arg Glu Asn Thr Gln Leu Met Asn Arg Asp Asn
20 25 30

Ile Glu Ile Thr Val Arg Asp Phe Lys Thr Leu Ala Pro Arg Arg Trp
35 40 45

Leu Asn Asp Thr Ile Ile Glu Phe Phe Met Lys Tyr Ile Glu Lys Ser
50 55 60

Thr Pro Asn Thr Val Ala Phe Asn Ser Phe Phe Tyr Thr Asn Leu Ser
65 70 75 80

Glu Arg Gly Tyr Gln Gly Val Arg Arg Trp Met Lys Arg Lys Lys Thr
85 90 95

Gln Ile Asp Lys Leu Asp Lys Ile Phe Thr Pro Ile Asn Leu Asn Gln
100 105 110

Ser His Trp Ala Leu Gly Ile Ile Asp Leu Lys Lys Lys Thr Ile Gly
115 120 125

Tyr Val Asp Ser Leu Ser Asn Gly Pro Asn Ala Met Ser Phe Ala Ile
130 135 140

Leu Thr Asp Leu Gln Lys Tyr Val Met Glu Glu Ser Lys His Thr Ile
145 150 155 160

Gly Glu Asp Phe Asp Leu Ile His Leu Asp Cys Pro Gln Gln Pro Asn
165 170 175

Gly Tyr Asp Cys Gly Ile Tyr Val Cys Met Asn Thr Leu Tyr Gly Ser
180 185 190

Ala Asp Ala Pro Leu Asp Phe Asp Tyr Lys Asp Ala Ile Arg Met Arg
195 200 205

Arg Phe Ile Ala His Leu Ile Leu Thr Asp Ala Leu Lys
210 215 220



<210> 29
<211> 166
<212> PRT
<213> Pyrococcus horikoshii

<400> 29

Met Lys Val Leu Phe Leu Thr Ala Asn Glu Phe Glu Asp Val Glu Leu
1 5 10 15

Ile Tyr Pro Tyr His Arg Leu Lys Glu Glu Gly His Glu Val Tyr Ile
20 25 30

Ala Ser Phe Glu Arg Gly Thr Ile Thr Gly Lys His Gly Tyr Ser Val
35 40 45

Lys Val Asp Leu Thr Phe Asp Lys Val Asn Pro Glu Glu Phe Asp Ala
50 55 60

Leu Val Leu Pro Gly Gly Arg Ala Pro Glu Arg Val Arg Leu Asn Glu
65 70 75 80

Lys Ala Val Ser Ile Ala Arg Lys Met Phe Ser Glu Gly Lys Pro Val
85 90 95

Ala Ser Ile Cys His Gly Pro Gln Ile Leu Ile Ser Ala Gly Val Leu
100 105 110

Arg Gly Arg Lys Gly Thr Ser Tyr Pro Gly Ile Lys Asp Asp Met Ile
115 120 125

Asn Ala Gly Val Glu Trp Val Asp Ala Glu Val Val Val Asp Gly Asn
130 135 140

Trp Val Ser Ser Arg Val Pro Ala Asp Leu Tyr Ala Trp Met Arg Glu
145 150 155 160

Phe Val Lys Leu Leu Lys
165



<210> 30
<211> 316
<212> PRT
<213> Bacillus thermoproteolyticus

<400> 30

Ile Thr Gly Thr Ser Thr Val Gly Val Gly Arg Gly Val Leu Gly Asp
1 5 10 15

Gln Lys Asn Ile Asn Thr Thr Tyr Ser Thr Tyr Tyr Tyr Leu Gln Asp
20 25 30

Asn Thr Arg Gly Asp Gly Ile Phe Thr Tyr Asp Ala Lys Tyr Arg Thr
35 40 45

Thr Leu Pro Gly Ser Leu Trp Ala Asp Ala Asp Asn Gln Phe Phe Ala
50 55 60

Ser Tyr Asp Ala Pro Ala Val Asp Ala His Tyr Tyr Ala Gly Val Thr
65 70 75 80

Tyr Asp Tyr Tyr Lys Asn Val His Asn Arg Leu Ser Tyr Asp Gly Asn
85 90 95

Asn Ala Ala Ile Arg Ser Ser Val His Tyr Ser Gln Gly Tyr Asn Asn
100 105 110

Ala Phe Trp Asn Gly Ser Glu Met Val Tyr Gly Asp Gly Asp Gly Gln
115 120 125

Thr Phe Ile Pro Leu Ser Gly Gly Ile Asp Val Val Ala His Glu Leu
130 135 140

Thr His Ala Val Thr Asp Tyr Thr Ala Gly Leu Ile Tyr Gln Asn Glu
145 150 155 160

Ser Gly Ala Ile Asn Glu Ala Ile Ser Asp Ile Phe Gly Thr Leu Val
165 170 175

Glu Phe Tyr Ala Asn Lys Asn Pro Asp Trp Glu Ile Gly Glu Asp Val
180 185 190

Tyr Thr Pro Gly Ile Ser Gly Asp Ser Leu Arg Ser Met Ser Asp Pro
195 200 205

Ala Lys Tyr Gly Asp Pro Asp His Tyr Ser Lys Arg Tyr Thr Gly Thr
210 215 220

Gln Asp Asn Gly Gly Val His Ile Asn Ser Gly Ile Ile Asn Lys Ala
225 230 235 240

Ala Tyr Leu Ile Ser Gln Gly Gly Thr His Tyr Gly Val Ser Val Val
245 250 255

Gly Ile Gly Arg Asp Lys Leu Gly Lys Ile Phe Tyr Arg Ala Leu Thr
260 265 270

Gln Tyr Leu Thr Pro Thr Ser Asn Phe Ser Gln Leu Arg Ala Ala Ala
275 280 285

Val Gln Ser Ala Thr Asp Leu Tyr Gly Ser Thr Ser Gln Glu Val Ala
290 295 300

Ser Val Lys Gln Ala Phe Asp Ala Val Gly Val Lys
305 310 315



<210> 31
<211> 169
<212> PRT
<213> Homo sapiens

<400> 31

Val Leu Thr Glu Gly Asn Pro Arg Trp Glu Gln Thr His Leu Thr Tyr
1 5 10 15

Arg Ile Glu Asn Tyr Thr Pro Asp Leu Pro Arg Ala Asp Val Asp His
20 25 30

Ala Ile Glu Lys Ala Phe Gln Leu Trp Ser Asn Val Thr Pro Leu Thr
35 40 45

Phe Thr Lys Val Ser Glu Gly Gln Ala Asp Ile Met Ile Ser Phe Val
50 55 60

Arg Gly Asp His Arg Asp Asn Ser Pro Phe Asp Gly Pro Gly Gly Asn
65 70 75 80

Leu Ala His Ala Phe Gln Pro Gly Pro Gly Ile Gly Gly Asp Ala His
85 90 95

Phe Asp Glu Asp Glu Arg Trp Thr Asn Asn Phe Arg Glu Tyr Asn Leu
100 105 110

His Arg Val Ala Ala His Glu Leu Gly His Ser Leu Gly Leu Ser His
115 120 125

Ser Thr Asp Ile Gly Ala Leu Met Tyr Pro Ser Tyr Thr Phe Ser Gly
130 135 140

Asp Val Gln Leu Ala Gln Asp Asp Ile Asp Gly Ile Gln Ala Ile Tyr
145 150 155 160

Gly Arg Ser Gln Asn Pro Val Gln Pro
165






<210> 32
<211> 496
<212> PRT
<213> Homo sapiens

<400> 32

Gln Tyr Ser Pro Asn Thr Gln Gln Gly Arg Thr Ser Ile Val His Leu
1 5 10 15

Phe Glu Trp Arg Trp Val Asp Ile Ala Leu Glu Cys Glu Arg Tyr Leu
20 25 30

Ala Pro Lys Gly Phe Gly Gly Val Gln Val Ser Pro Pro Asn Glu Asn
35 40 45

Val Ala Ile Tyr Asn Pro Phe Arg Pro Trp Trp Glu Arg Tyr Gln Pro
50 55 60

Val Ser Tyr Lys Leu Cys Thr Arg Ser Gly Asn Glu Asp Glu Phe Arg
65 70 75 80

Asn Met Val Thr Arg Cys Asn Asn Val Gly Val Arg Ile Tyr Val Asp
85 90 95

Ala Val Ile Asn His Met Cys Gly Asn Ala Val Ser Ala Gly Thr Ser
100 105 110

Ser Thr Cys Gly Ser Tyr Phe Asn Pro Gly Ser Arg Asp Phe Pro Ala
115 120 125

Val Pro Tyr Ser Gly Trp Asp Phe Asn Asp Gly Lys Cys Lys Thr Gly
130 135 140

Ser Gly Asp Ile Glu Asn Tyr Asn Asp Ala Thr Gln Val Arg Asp Cys
145 150 155 160

Arg Leu Thr Gly Leu Leu Asp Leu Ala Leu Glu Lys Asp Tyr Val Arg
165 170 175

Ser Lys Ile Ala Glu Tyr Met Asn His Leu Ile Asp Ile Gly Val Ala
180 185 190

Gly Phe Arg Leu Asp Ala Ser Lys His Met Trp Pro Gly Asp Ile Lys
195 200 205

Ala Ile Leu Asp Lys Leu His Asn Leu Asn Ser Asn Trp Phe Pro Ala
210 215 220

Gly Ser Lys Pro Phe Ile Tyr Gln Glu Val Ile Asp Leu Gly Gly Glu
225 230 235 240

Pro Ile Lys Ser Ser Asp Tyr Phe Gly Asn Gly Arg Val Thr Glu Phe
245 250 255

Lys Tyr Gly Ala Lys Leu Gly Thr Val Ile Arg Lys Trp Asn Gly Glu
260 265 270

Lys Met Ser Tyr Leu Lys Asn Trp Gly Glu Gly Trp Gly Phe Val Pro
275 280 285

Ser Asp Arg Ala Leu Val Phe Val Asp Asn His Asp Asn Gln Arg Gly
290 295 300

His Gly Ala Gly Gly Ala Ser Ile Leu Thr Phe Trp Asp Ala Arg Leu
305 310 315 320

Tyr Lys Met Ala Val Gly Phe Met Leu Ala His Pro Tyr Gly Phe Thr
325 330 335

Arg Val Met Ser Ser Tyr Arg Trp Pro Arg Gln Phe Gln Asn Gly Asn
340 345 350

Asp Val Asn Asp Trp Val Gly Pro Pro Asn Asn Asn Gly Val Ile Lys
355 360 365

Glu Val Thr Ile Asn Pro Asp Thr Thr Cys Gly Asn Asp Trp Val Cys
370 375 380

Glu His Arg Trp Arg Gln Ile Arg Asn Met Val Ile Phe Arg Asn Val
385 390 395 400

Val Asp Gly Gln Pro Phe Thr Asn Trp Tyr Asp Asn Gly Ser Asn Gln
405 410 415

Val Ala Phe Gly Arg Gly Asn Arg Gly Phe Ile Val Phe Asn Asn Asp
420 425 430

Asp Trp Ser Phe Ser Leu Thr Leu Gln Thr Gly Leu Pro Ala Gly Thr
435 440 445

Tyr Cys Asp Val Ile Ser Gly Asp Lys Ile Asn Gly Asn Cys Thr Gly
450 455 460

Ile Lys Ile Tyr Val Ser Asp Asp Gly Lys Ala His Phe Ser Ile Ser
465 470 475 480

Asn Ser Ala Glu Asp Pro Phe Ile Ala Ile His Ala Glu Ser Lys Leu
485 490 495



<210> 33
<211> 370
<212> PRT
<213> Trichoderma reesei

<400> 33

Gln Pro Gly Thr Ser Thr Pro Glu Val His Pro Lys Leu Thr Thr Tyr
1 5 10 15

Lys Cys Thr Lys Ser Gly Gly Cys Val Ala Gln Asp Thr Ser Val Val
20 25 30

Leu Asp Trp Asn Tyr Arg Trp Met His Asp Ala Asn Tyr Asn Ser Cys
35 40 45

Thr Val Asn Gly Gly Val Asn Thr Thr Leu Cys Pro Asp Glu Ala Thr
50 55 60

Cys Gly Lys Asn Cys Phe Ile Glu Gly Val Asp Tyr Ala Ala Ser Gly
65 70 75 80

Val Thr Thr Ser Gly Ser Ser Leu Thr Met Asn Gln Tyr Met Pro Ser
85 90 95

Ser Ser Gly Gly Tyr Ser Ser Val Ser Pro Arg Leu Tyr Leu Leu Asp
100 105 110

Ser Asp Gly Glu Tyr Val Met Leu Lys Leu Asn Gly Gln Glu Leu Ser
115 120 125

Phe Asp Val Asp Leu Ser Ala Leu Pro Cys Gly Glu Asn Gly Ser Leu
130 135 140

Tyr Leu Ser Gln Met Asp Glu Asn Gly Gly Ala Asn Gln Tyr Asn Thr
145 150 155 160

Ala Gly Ala Asn Tyr Gly Ser Gly Tyr Cys Asp Ala Gln Cys Pro Val
165 170 175

Gln Thr Trp Arg Asn Gly Thr Leu Asn Thr Ser His Gln Gly Phe Cys
180 185 190

Cys Asn Glu Met Asp Ile Leu Glu Gly Asn Ser Arg Ala Asn Ala Leu
195 200 205

Thr Pro His Ser Cys Thr Ala Thr Ala Cys Asp Ser Ala Gly Cys Gly
210 215 220

Phe Asn Pro Tyr Gly Ser Gly Tyr Lys Ser Tyr Tyr Gly Pro Gly Asp
225 230 235 240

Thr Val Asp Thr Ser Lys Thr Phe Thr Ile Ile Thr Gln Phe Asn Thr
245 250 255

Asp Asn Gly Ser Pro Ser Gly Asn Leu Val Ser Ile Thr Arg Lys Tyr
260 265 270

Gln Gln Asn Gly Val Asp Ile Pro Ser Ala Gln Pro Gly Gly Asp Thr
275 280 285

Ile Ser Ser Cys Pro Ser Ala Ser Ala Tyr Gly Gly Leu Ala Thr Met
290 295 300

Gly Lys Ala Leu Ser Ser Gly Met Val Leu Val Phe Ser Ile Trp Asn
305 310 315 320

Asp Asn Ser Gln Tyr Met Asn Trp Leu Asp Ser Gly Asn Ala Gly Pro
325 330 335

Cys Ser Ser Thr Glu Gly Asn Pro Ser Asn Ile Leu Ala Asn Asn Pro
340 345 350

Asn Thr His Val Val Phe Ser Asn Ile Arg Trp Gly Asp Ile Gly Ser
355 360 365

Thr Thr
370



<210> 34
<211> 223
<212> PRT
<213> Aspergillus niger

<400> 34

Gln Thr Met Cys Ser Gln Tyr Asp Ser Ala Ser Ser Pro Pro Tyr Ser
1 5 10 15

Val Asn Gln Asn Leu Trp Gly Glu Tyr Gln Gly Thr Gly Ser Gln Cys
20 25 30

Val Tyr Val Asp Lys Leu Ser Ser Ser Gly Ala Ser Trp His Thr Glu
35 40 45

Trp Thr Trp Ser Gly Gly Glu Gly Thr Val Lys Ser Tyr Ser Asn Ser
50 55 60

Gly Val Thr Phe Asn Lys Lys Leu Val Ser Asp Val Ser Ser Ile Pro
65 70 75 80

Thr Ser Val Glu Trp Lys Gln Asp Asn Thr Asn Val Asn Ala Asp Val
85 90 95

Ala Tyr Asp Leu Phe Thr Ala Ala Asn Val Asp His Ala Thr Ser Ser
100 105 110

Gly Asp Tyr Glu Leu Met Ile Trp Leu Ala Arg Tyr Gly Asn Ile Gln
115 120 125

Pro Ile Gly Lys Gln Ile Ala Thr Ala Thr Val Gly Gly Lys Ser Trp
130 135 140

Glu Val Trp Tyr Gly Ser Thr Thr Gln Ala Gly Ala Glu Gln Arg Thr
145 150 155 160

Tyr Ser Phe Val Ser Glu Ser Pro Ile Asn Ser Tyr Ser Gly Asp Ile
165 170 175

Asn Ala Phe Phe Ser Tyr Leu Thr Gln Asn Gln Gly Phe Pro Ala Ser
180 185 190

Ser Gln Tyr Leu Ile Asn Leu Gln Phe Gly Thr Glu Ala Phe Thr Gly
195 200 205

Gly Pro Ala Thr Phe Thr Val Asp Asn Trp Thr Ala Ser Val Asn
210 215 220



<210> 35
<211> 184
<212> PRT
<213> Aspergillus niger

<400> 35

Ser Ala Gly Ile Asn Tyr Val Gln Asn Tyr Asn Gly Asn Leu Gly Asp
1 5 10 15

Phe Thr Tyr Asp Glu Ser Ala Gly Thr Phe Ser Met Tyr Trp Glu Asp
20 25 30

Gly Val Ser Ser Asp Phe Val Val Gly Leu Gly Trp Thr Thr Gly Ser
35 40 45

Ser Asn Ala Ile Thr Tyr Ser Ala Glu Tyr Ser Ala Ser Gly Ser Ala
50 55 60

Ser Tyr Leu Ala Val Tyr Gly Trp Val Asn Tyr Pro Gln Ala Glu Tyr
65 70 75 80

Tyr Ile Val Glu Asp Tyr Gly Asp Tyr Asn Pro Cys Ser Ser Ala Thr
85 90 95

Ser Leu Gly Thr Val Tyr Ser Asp Gly Ser Thr Tyr Gln Val Cys Thr
100 105 110

Asp Thr Arg Thr Asn Glu Pro Ser Ile Thr Gly Thr Ser Thr Phe Thr
115 120 125

Gln Tyr Phe Ser Val Arg Glu Ser Thr Arg Thr Ser Gly Thr Val Thr
130 135 140

Val Ala Asn His Phe Asn Phe Trp Ala His His Gly Phe Gly Asn Ser
145 150 155 160

Asp Phe Asn Tyr Gln Val Val Ala Val Glu Ala Trp Ser Gly Ala Gly
165 170 175

Ser Ala Ser Val Thr Ile Ser Ser
180



<210> 36
<211> 313
<212> PRT
<213> Streptomyces lividans

<400> 36

Ala Glu Ser Thr Leu Gly Ala Ala Ala Ala Gln Ser Gly Arg Tyr Phe
1 5 10 15

Gly Thr Ala Ile Ala Ser Gly Arg Leu Ser Asp Ser Thr Tyr Thr Ser
20 25 30

Ile Ala Gly Arg Glu Phe Asn Met Val Thr Ala Glu Asn Glu Met Lys
35 40 45

Ile Asp Ala Thr Glu Pro Gln Arg Gly Gln Phe Asn Phe Ser Ser Ala
50 55 60

Asp Arg Val Tyr Asn Trp Ala Val Gln Asn Gly Lys Gln Val Arg Gly
65 70 75 80

His Thr Leu Ala Trp His Ser Gln Gln Pro Gly Trp Met Gln Ser Leu
85 90 95

Ser Gly Ser Ala Leu Arg Gln Ala Met Ile Asp His Ile Asn Gly Val
100 105 110

Met Ala His Tyr Lys Gly Lys Ile Val Gln Trp Asp Val Val Asn Glu
115 120 125

Ala Phe Ala Asp Gly Ser Ser Gly Ala Arg Arg Asp Ser Asn Leu Gln
130 135 140

Arg Ser Gly Asn Asp Trp Ile Glu Val Ala Phe Arg Thr Ala Arg Ala
145 150 155 160

Ala Asp Pro Ser Ala Lys Leu Cys Tyr Asn Asp Tyr Asn Val Glu Asn
165 170 175

Trp Thr Trp Ala Lys Thr Gln Ala Met Tyr Asn Met Val Arg Asp Phe
180 185 190

Lys Gln Arg Gly Val Pro Ile Asp Cys Val Gly Phe Gln Ser His Phe
195 200 205

Asn Ser Gly Ser Pro Tyr Asn Ser Asn Phe Arg Thr Thr Leu Gln Asn
210 215 220






EP 1 633 865 B1 



Phe Ala Ala Leu Gly Val Asp Val Ala Ile Thr Glu Leu Asp Ile Gln
225 230 235 240

Gly Ala Pro Ala Ser Thr Tyr Ala Asn Val Thr Asn Asp Cys Leu Ala
245 250 255

Val Ser Arg Cys Leu Gly Ile Thr Val Trp Gly Val Arg Asp Ser Asp
260 265 270

Ser Trp Arg Ser Glu Gln Thr Pro Leu Leu Phe Asn Asn Asp Gly Ser
275 280 285

Lys Lys Ala Ala Tyr Thr Ala Val Leu Asp Ala Leu Asn Gly Gly Ala
290 295 300

Ser Ser Glu Pro Pro Ala Asp Gly Gly
305 310

<210> 37
<211> 362
<212> PRT
<213> Aspergillus niger

<400> 37

Met His Ser Phe Ala Ser Leu Leu Ala Tyr Gly Leu Val Ala Gly Ala
1 5 10 15

Thr Phe Ala Ser Ala Ser Pro Ile Glu Ala Arg Asp Ser Cys Thr Phe
20 25 30

Thr Thr Ala Ala Ala Ala Lys Ala Gly Lys Ala Lys Cys Ser Thr Ile
35 40 45

Thr Leu Asn Asn Ile Glu Val Pro Ala Gly Thr Thr Leu Asp Leu Thr
50 55 60

Gly Leu Thr Ser Gly Thr Lys Val Ile Phe Glu Gly Thr Thr Thr Phe
65 70 75 80

Gln Tyr Glu Glu Trp Ala Gly Pro Leu Ile Ser Met Ser Gly Glu His
85 90 95

Ile Thr Val Thr Gly Ala Ser Gly His Leu Ile Asn Cys Asp Gly Ala
100 105 110

Arg Trp Trp Asp Gly Lys Gly Thr Ser Gly Lys Lys Lys Pro Lys Phe
115 120 125

Phe Tyr Ala His Gly Leu Asp Ser Ser Ser Ile Thr Gly Leu Asn Ile
130 135 140

Lys Asn Thr Pro Leu Met Ala Phe Ser Val Gln Ala Asn Asp Ile Thr
145 150 155 160

Phe Thr Asp Val Thr Ile Asn Asn Ala Asp Gly Asp Thr Gln Gly Gly
165 170 175

His Asn Thr Asp Ala Phe Asp Val Gly Asn Ser Val Gly Val Asn Ile
180 185 190

Ile Lys Pro Trp Val His Asn Gln Asp Asp Cys Leu Ala Val Asn Ser
195 200 205

Gly Glu Asn Ile Trp Phe Thr Gly Gly Thr Cys Ile Gly Gly His Gly
210 215 220

Leu Ser Ile Gly Ser Val Gly Asp Arg Ser Asn Asn Val Val Lys Asn
225 230 235 240

Val Thr Ile Glu His Ser Thr Val Ser Asn Ser Glu Asn Ala Val Arg
245 250 255

Ile Lys Thr Ile Ser Gly Ala Thr Gly Ser Val Ser Glu Ile Thr Tyr
260 265 270

Ser Asn Ile Val Met Ser Gly Ile Ser Asp Tyr Gly Val Val Ile Gln
275 280 285

Gln Asp Tyr Glu Asp Gly Lys Pro Thr Gly Lys Pro Thr Asn Gly Val
290 295 300

Thr Ile Gln Asp Val Lys Leu Glu Ser Val Thr Gly Ser Val Asp Ser
305 310 315 320

Gly Ala Thr Glu Ile Tyr Leu Leu Cys Gly Ser Gly Ser Cys Ser Asp
325 330 335

Trp Thr Trp Asp Asp Val Lys Val Thr Gly Gly Lys Lys Ser Thr Ala
340 345 350

Cys Lys Asn Phe Pro Ser Val Ala Ser Cys
355 360


<210> 38
<211> 383
<212> PRT
<213> Pseudomonas cellulosa

<400> 38

Arg Ala Asp Val Lys Pro Val Thr Val Lys Leu Val Asp Ser Gln Ala
1 5 10 15

Thr Met Glu Thr Arg Ser Leu Phe Ala Phe Met Gln Glu Gln Arg Arg
20 25 30

His Ser Ile Met Phe Gly His Gln His Glu Thr Thr Gln Gly Leu Thr
35 40 45

Ile Thr Arg Thr Asp Gly Thr Gln Ser Asp Thr Phe Asn Ala Val Gly
50 55 60

Asp Phe Ala Ala Val Tyr Gly Trp Asp Thr Leu Ser Ile Val Ala Pro
65 70 75 80

Lys Ala Glu Gly Asp Ile Val Ala Gln Val Lys Lys Ala Tyr Ala Arg
85 90 95

Gly Gly Ile Ile Thr Val Ser Ser His Phe Asp Asn Pro Lys Thr Asp
100 105 110

Thr Gln Lys Gly Val Trp Pro Val Gly Thr Ser Trp Asp Gln Thr Pro
115 120 125

Ala Val Val Asp Ser Leu Pro Gly Gly Ala Tyr Asn Pro Val Leu Asn
130 135 140

Gly Tyr Leu Asp Gln Val Ala Glu Trp Ala Asn Asn Leu Lys Asp Glu
145 150 155 160

Gln Gly Arg Leu Ile Pro Val Ile Phe Arg Leu Tyr His Ala Asn Thr
165 170 175

Gly Ser Trp Phe Trp Trp Gly Asp Lys Gln Ser Thr Pro Glu Gln Tyr
180 185 190

Lys Gln Leu Phe Arg Tyr Ser Val Glu Tyr Leu Arg Asp Val Lys Gly
195 200 205

Val Arg Asn Phe Leu Tyr Ala Tyr Ser Pro Asn Asn Phe Trp Asp Val
210 215 220

Thr Glu Ala Asn Tyr Leu Glu Arg Tyr Pro Gly Asp Glu Trp Val Asp
225 230 235 240

Val Leu Gly Phe Asp Thr Tyr Gly Pro Val Ala Asp Asn Ala Asp Trp
245 250 255

Phe Arg Asn Val Val Ala Asn Ala Ala Leu Val Ala Arg Met Ala Glu
260 265 270

Ala Arg Gly Lys Ile Pro Val Ile Ser Glu Ile Gly Ile Arg Ala Pro
275 280 285

Asp Ile Glu Ala Gly Leu Tyr Asp Asn Gln Trp Tyr Arg Lys Leu Ile
290 295 300

Ser Gly Leu Lys Ala Asp Pro Asp Ala Arg Glu Ile Ala Phe Leu Leu
305 310 315 320

Val Trp Arg Asn Ala Pro Gln Gly Val Pro Gly Pro Asn Gly Thr Gln
325 330 335

Val Pro His Tyr Trp Val Pro Ala Asn Arg Pro Glu Asn Ile Asn Asn
340 345 350

Gly Thr Leu Glu Asp Phe Gln Ala Phe Tyr Ala Asp Glu Phe Thr Ala
355 360 365

Phe Asn Arg Asp Ile Glu Gln Val Tyr Gln Arg Pro Thr Leu Ile
370 375 380



<210> 39
<211> 419
<212> PRT
<213> Bacillus circulans

<400> 39

Leu Gln Pro Ala Thr Ala Glu Ala Ala Asp Ser Tyr Lys Ile Val Gly
1 5 10 15

Tyr Tyr Pro Ser Trp Ala Ala Tyr Gly Arg Asn Tyr Asn Val Ala Asp
20 25 30

Ile Asp Pro Thr Lys Val Thr His Ile Asn Tyr Ala Phe Ala Asp Ile
35 40 45

Cys Trp Asn Gly Ile His Gly Asn Pro Asp Pro Ser Gly Pro Asn Pro
50 55 60

Val Thr Trp Thr Cys Gln Asn Glu Lys Ser Gln Thr Ile Asn Val Pro
65 70 75 80

Asn Gly Thr Ile Val Leu Gly Asp Pro Trp Ile Asp Thr Gly Lys Thr
85 90 95

Phe Ala Gly Asp Thr Trp Asp Gln Pro Ile Ala Gly Asn Ile Asn Gln
100 105 110

Leu Asn Lys Leu Lys Gln Thr Asn Pro Asn Leu Lys Thr Ile Ile Ser
115 120 125

Val Gly Gly Trp Thr Trp Ser Asn Arg Phe Ser Asp Val Ala Ala Thr
130 135 140

Ala Ala Thr Arg Glu Val Phe Ala Asn Ser Ala Val Asp Phe Leu Arg
145 150 155 160

Lys Tyr Asn Phe Asp Gly Val Asp Leu Asp Trp Glu Tyr Pro Val Ser
165 170 175

Gly Gly Leu Asp Gly Asn Ser Lys Arg Pro Glu Asp Lys Gln Asn Tyr
180 185 190

Thr Leu Leu Leu Ser Lys Ile Arg Glu Lys Leu Asp Ala Ala Gly Ala
195 200 205

Val Asp Gly Lys Lys Tyr Leu Leu Thr Ile Ala Ser Gly Ala Ser Ala
210 215 220

Thr Tyr Ala Ala Asn Thr Glu Leu Ala Lys Ile Ala Ala Ile Val Asp
225 230 235 240

Trp Ile Asn Ile Met Thr Tyr Asp Phe Asn Gly Ala Trp Gln Lys Ile
245 250 255

Ser Ala His Asn Ala Pro Leu Asn Tyr Asp Pro Ala Ala Ser Ala Ala
260 265 270

Gly Val Pro Asp Ala Asn Thr Phe Asn Val Ala Ala Gly Ala Gln Gly
275 280 285

His Leu Asp Ala Gly Val Pro Ala Ala Lys Leu Val Leu Gly Val Pro
290 295 300

Phe Tyr Gly Arg Gly Trp Asp Gly Cys Ala Gln Ala Gly Asn Gly Gln
305 310 315 320

Tyr Gln Thr Cys Thr Gly Gly Ser Ser Val Gly Thr Trp Glu Ala Gly
325 330 335

Ser Phe Asp Phe Tyr Asp Leu Glu Ala Asn Tyr Ile Asn Lys Asn Gly
340 345 350

Tyr Thr Arg Tyr Trp Asn Asp Thr Ala Lys Val Pro Tyr Leu Tyr Asn
355 360 365

Ala Ser Asn Lys Arg Phe Ile Ser Tyr Asp Asp Ala Glu Ser Val Gly
370 375 380

Tyr Lys Thr Ala Tyr Ile Lys Ser Lys Gly Leu Gly Gly Ala Met Phe
385 390 395 400

Trp Glu Leu Ser Gly Asp Arg Asn Lys Thr Leu Gln Asn Lys Leu Lys
405 410 415

Ala Asp Leu



<210> 40
<211> 317
<212> PRT
<213> Candida antarctica

<400> 40

Leu Pro Ser Gly Ser Asp Pro Ala Phe Ser Gln Pro Lys Ser Val Leu
1 5 10 15

Asp Ala Gly Leu Thr Cys Gln Gly Ala Ser Pro Ser Ser Val Ser Lys
20 25 30

Pro Ile Leu Leu Val Pro Gly Thr Gly Thr Thr Gly Pro Gln Ser Phe
35 40 45

Asp Ser Asn Trp Ile Pro Leu Ser Thr Gln Leu Gly Tyr Thr Pro Cys
50 55 60

Trp Ile Ser Pro Pro Pro Phe Met Leu Asn Asp Thr Gln Val Asn Thr
65 70 75 80

Glu Tyr Met Val Asn Ala Ile Thr Ala Leu Tyr Ala Gly Ser Gly Asn
85 90 95

Asn Lys Leu Pro Val Leu Thr Trp Ser Gln Gly Gly Leu Val Ala Gln
100 105 110

Trp Gly Leu Thr Phe Phe Pro Ser Ile Arg Ser Lys Val Asp Arg Leu
115 120 125

Met Ala Phe Ala Pro Asp Tyr Lys Gly Thr Val Leu Ala Gly Pro Leu
130 135 140

Asp Ala Leu Ala Val Ser Ala Pro Ser Val Trp Gln Gln Thr Thr Gly
145 150 155 160

Ser Ala Leu Thr Thr Ala Leu Arg Asn Ala Gly Gly Leu Thr Gln Ile
165 170 175

Val Pro Thr Thr Asn Leu Tyr Ser Ala Thr Asp Glu Ile Val Gln Pro
180 185 190

Gln Val Ser Asn Ser Pro Leu Asp Ser Ser Tyr Leu Phe Asn Gly Lys
195 200 205

Asn Val Gln Ala Gln Ala Val Cys Gly Pro Leu Phe Val Ile Asp His
210 215 220

Ala Gly Ser Leu Thr Ser Gln Phe Ser Tyr Val Val Gly Arg Ser Ala
225 230 235 240

Leu Arg Ser Thr Thr Gly Gln Ala Arg Ser Ala Asp Tyr Gly Ile Thr
245 250 255

Asp Cys Asn Pro Leu Pro Ala Asn Asp Leu Thr Pro Glu Gln Lys Val
260 265 270

Ala Ala Ala Ala Leu Leu Ala Pro Ala Ala Ala Ala Ile Val Ala Gly
275 280 285

Pro Lys Gln Asn Cys Glu Pro Asp Leu Met Pro Tyr Ala Arg Pro Phe
290 295 300

Ala Val Gly Lys Arg Thr Cys Ser Gly Ile Val Thr Pro
305 310 315



<210> 41
<211> 434
<212> PRT
<213> artificial sequence

<220>
<223> chimera of guinea pig and homo sapiens (human = approx. last 30 amino acids)

<400> 41

Ala Glu Val Cys Tyr Ser His Leu Gly Cys Phe Ser Asp Glu Lys Pro
1 5 10 15

Trp Ala Gly Thr Ser Gln Arg Pro Ile Lys Ser Leu Pro Ser Asp Pro
20 25 30

Lys Lys Ile Asn Thr Arg Phe Leu Leu Tyr Thr Asn Glu Asn Gln Asn
35 40 45

Ser Tyr Gln Leu Ile Thr Ala Thr Asp Ile Ala Thr Ile Lys Ala Ser
50 55 60

Asn Phe Asn Leu Asn Arg Lys Thr Arg Phe Ile Ile His Gly Phe Thr
65 70 75 80

Asp Ser Gly Glu Asn Ser Trp Leu Ser Asp Met Cys Lys Asn Met Phe
85 90 95

Gln Val Glu Lys Val Asn Cys Ile Cys Val Asp Trp Lys Gly Gly Ser
100 105 110

Lys Ala Gln Tyr Ser Gln Ala Ser Gln Asn Ile Arg Val Val Gly Ala
115 120 125

Glu Val Ala Tyr Leu Val Gln Val Leu Ser Thr Ser Leu Asn Tyr Ala
130 135 140

Pro Glu Asn Val His Ile Ile Gly His Ser Leu Gly Ala His Thr Ala
145 150 155 160

Gly Glu Ala Gly Lys Arg Leu Asn Gly Leu Val Gly Arg Ile Thr Gly
165 170 175

Leu Asp Pro Ala Glu Pro Tyr Phe Gln Asp Thr Pro Glu Glu Val Arg
180 185 190

Leu Asp Pro Ser Asp Ala Lys Phe Val Asp Val Ile His Thr Asp Ile
195 200 205

Ser Pro Ile Leu Pro Ser Leu Gly Phe Gly Met Ser Gln Lys Val Gly
210 215 220

His Met Asp Phe Phe Pro Asn Gly Gly Lys Asp Met Pro Gly Cys Lys
225 230 235 240

Thr Gly Ile Ser Cys Asn His His Arg Ser Ile Glu Tyr Tyr His Ser
245 250 255

Ser Ile Leu Asn Pro Glu Gly Phe Leu Gly Tyr Pro Cys Ala Ser Tyr
260 265 270

Asp Glu Phe Gln Glu Ser Gly Cys Phe Pro Cys Pro Ala Lys Gly Cys
275 280 285

Pro Lys Met Gly His Phe Ala Asp Gln Tyr Pro Gly Lys Thr Asn Ala
290 295 300

Val Glu Gln Thr Phe Phe Leu Asn Thr Gly Ala Ser Asp Asn Phe Thr
305 310 315 320

Arg Trp Arg Tyr Lys Val Thr Val Thr Leu Ser Gly Glu Lys Asp Pro
325 330 335

Ser Gly Asn Ile Asn Val Ala Leu Leu Gly Lys Asn Gly Asn Ser Ala
340 345 350

Gln Tyr Gln Val Phe Lys Gly Thr Leu Lys Pro Asp Ala Ser Tyr Thr
355 360 365

Asn Ser Ile Asp Val Glu Leu Asn Val Gly Thr Ile Gln Lys Val Thr
370 375 380

Phe Leu Trp Lys Arg Ser Gly Ile Ser Val Ser Lys Pro Lys Met Gly
385 390 395 400

Ala Ser Arg Ile Thr Val Gln Ser Gly Lys Asp Gly Thr Lys Tyr Asn
405 410 415

Phe Cys Ser Ser Asp Ile Val Gln Glu Asn Val Glu Gln Thr Leu Ser
420 425 430

Pro Cys



<210> 42
<211> 471
<212> PRT
<213> Escherichia coli

<400> 42

Met Lys Gln Ser Thr Ile Ala Leu Ala Leu Leu Pro Leu Leu Phe Thr
1 5 10 15

Pro Val Thr Lys Ala Arg Thr Pro Glu Met Pro Val Leu Glu Asn Arg
20 25 30

Ala Ala Gln Gly Asp Ile Thr Ala Pro Gly Gly Ala Arg Arg Leu Thr
35 40 45

Gly Asp Gln Thr Ala Ala Leu Arg Asp Ser Leu Ser Asp Lys Pro Ala
50 55 60

Lys Asn Ile Ile Leu Leu Ile Gly Asp Gly Met Gly Asp Ser Glu Ile
65 70 75 80

Thr Ala Ala Arg Asn Tyr Ala Glu Gly Ala Gly Gly Phe Phe Lys Gly
85 90 95

Ile Asp Ala Leu Pro Leu Thr Gly Gln Tyr Thr His Tyr Ala Leu Asn
100 105 110

Lys Lys Thr Gly Lys Pro Asp Tyr Val Thr Asp Ser Ala Ala Ser Ala
115 120 125

Thr Ala Trp Ser Thr Gly Val Lys Thr Tyr Asn Gly Ala Leu Gly Val
130 135 140

Asp Ile His Glu Lys Asp His Pro Thr Ile Leu Glu Met Ala Lys Ala
145 150 155 160

Ala Gly Leu Ala Thr Gly Asn Val Ser Thr Ala Glu Leu Gln Asp Ala
165 170 175

Thr Pro Ala Ala Leu Val Ala His Val Thr Ser Arg Lys Cys Tyr Gly
180 185 190

Pro Ser Ala Thr Ser Glu Lys Cys Pro Gly Asn Ala Leu Glu Lys Gly
195 200 205

Gly Lys Gly Ser Ile Thr Glu Gln Leu Leu Asn Ala Arg Ala Asp Val
210 215 220

Thr Leu Gly Gly Gly Ala Lys Thr Phe Ala Glu Thr Ala Thr Ala Gly
225 230 235 240

Glu Trp Gln Gly Lys Thr Leu Arg Glu Gln Ala Gln Ala Arg Gly Tyr
245 250 255

Gln Leu Val Ser Asp Ala Ala Ser Leu Asn Ser Val Thr Glu Ala Asn
260 265 270

Gln Gln Lys Pro Leu Leu Gly Leu Phe Ala Asp Gly Asn Met Pro Val
275 280 285

Arg Trp Leu Gly Pro Lys Ala Thr Tyr His Gly Asn Ile Asp Lys Pro
290 295 300

Ala Val Thr Cys Thr Pro Asn Pro Gln Arg Asn Asp Ser Val Pro Thr
305 310 315 320

Leu Ala Gln Met Thr Asp Lys Ala Ile Glu Leu Leu Ser Lys Asn Glu
325 330 335

Lys Gly Phe Phe Leu Gln Val Glu Gly Ala Ser Ile Asp Lys Gln Asp
340 345 350

His Ala Ala Asn Pro Cys Gly Gln Ile Gly Glu Thr Val Asp Leu Asp
355 360 365

Glu Ala Val Gln Arg Ala Leu Glu Phe Ala Lys Lys Glu Gly Asn Thr
370 375 380

Leu Val Ile Val Thr Ala Asp His Ala His Ala Ser Gln Ile Val Ala
385 390 395 400

Pro Asp Thr Lys Ala Pro Gly Leu Thr Gln Ala Leu Asn Thr Lys Asp
405 410 415

Gly Ala Val Met Val Met Ser Tyr Gly Asn Ser Glu Glu Asp Ser Gln
420 425 430

Glu His Thr Gly Ser Gln Leu Arg Ile Ala Ala Tyr Gly Pro His Ala
435 440 445

Ala Asn Val Val Gly Leu Thr Asp Gln Thr Asp Leu Phe Tyr Thr Met
450 455 460

Lys Ala Ala Leu Gly Leu Lys
465 470



<210> 43
<211> 260
<212> PRT
<213> Bovine

<400> 43

Leu Lys Ile Ala Ala Phe Asn Ile Arg Thr Phe Gly Glu Thr Lys Met
1 5 10 15

Ser Asn Ala Thr Leu Ala Ser Tyr Ile Val Arg Ile Val Arg Arg Tyr
20 25 30

Asp Ile Val Leu Ile Gln Glu Val Arg Asp Ser His Leu Val Ala Val
35 40 45

Gly Lys Leu Leu Asp Tyr Leu Asn Gln Asp Asp Pro Asn Thr Tyr His
50 55 60

Tyr Val Val Ser Glu Pro Leu Gly Arg Asn Ser Tyr Lys Glu Arg Tyr
65 70 75 80

Leu Phe Leu Phe Arg Pro Asn Lys Val Ser Val Leu Asp Thr Tyr Gln
85 90 95

Tyr Asp Asp Gly Cys Glu Ser Cys Gly Asn Asp Ser Phe Ser Arg Glu
100 105 110

Pro Ala Val Val Lys Phe Ser Ser His Ser Thr Lys Val Lys Glu Phe
115 120 125

Ala Ile Val Ala Leu His Ser Ala Pro Ser Asp Ala Val Ala Glu Ile
130 135 140

Asn Ser Leu Tyr Asp Val Tyr Leu Asp Val Gln Gln Lys Trp His Leu
145 150 155 160

Asn Asp Val Met Leu Met Gly Asp Phe Asn Ala Asp Cys Ser Tyr Val
165 170 175

Thr Ser Ser Gln Trp Ser Ser Ile Arg Leu Arg Thr Ser Ser Thr Phe
180 185 190

Gln Trp Leu Ile Pro Asp Ser Ala Asp Thr Thr Ala Thr Ser Thr Asn
195 200 205

Cys Ala Tyr Asp Arg Ile Val Val Ala Gly Ser Leu Leu Gln Ser Ser
210 215 220

Val Val Pro Gly Ser Ala Ala Pro Phe Asp Phe Gln Ala Ala Tyr Gly
225 230 235 240

Leu Ser Asn Glu Met Ala Leu Ala Ile Ser Asp His Tyr Pro Val Glu
245 250 255

Val Thr Leu Thr
260



<210> 44
<211> 686
<212> PRT
<213> Bacillus circulans

<400> 44

Ala Pro Asp Thr Ser Val Ser Asn Lys Gln Asn Phe Ser Thr Asp Val
1 5 10 15

Ile Tyr Gln Ile Phe Thr Asp Arg Phe Ser Asp Gly Asn Pro Ala Asn
20 25 30

Asn Pro Thr Gly Ala Ala Phe Asp Gly Thr Cys Thr Asn Leu Arg Leu
35 40 45

Tyr Cys Gly Gly Asp Trp Gln Gly Ile Ile Asn Lys Ile Asn Asp Gly
50 55 60

Tyr Leu Thr Gly Met Gly Val Thr Ala Ile Trp Ile Ser Gln Pro Val
65 70 75 80

Glu Asn Ile Tyr Ser Ile Ile Asn Tyr Ser Gly Val Asn Asn Thr Ala
85 90 95

Tyr His Gly Tyr Trp Ala Arg Asp Phe Lys Lys Thr Asn Pro Ala Tyr
100 105 110

Gly Thr Ile Ala Asp Phe Gln Asn Leu Ile Ala Ala Ala His Ala Lys
115 120 125

Asn Ile Lys Val Ile Ile Asp Phe Ala Pro Asn His Thr Ser Pro Ala
130 135 140

Ser Ser Asp Gln Pro Ser Phe Ala Glu Asn Gly Arg Leu Tyr Asp Asn
145 150 155 160

Gly Thr Leu Leu Gly Gly Tyr Thr Asn Asp Thr Gln Asn Leu Phe His
165 170 175

His Asn Gly Gly Thr Asp Phe Ser Thr Thr Glu Asn Gly Ile Tyr Lys
180 185 190

Asn Leu Tyr Asp Leu Ala Asp Leu Asn His Asn Asn Ser Thr Val Asp
195 200 205

Val Tyr Leu Lys Asp Ala Ile Lys Met Trp Leu Asp Leu Gly Ile Asp
210 215 220

Gly Ile Arg Met Asp Ala Val Lys His Met Pro Phe Gly Trp Gln Lys
225 230 235 240

Ser Phe Met Ala Ala Val Asn Asn Tyr Lys Pro Val Phe Thr Phe Gly
245 250 255

Glu Trp Phe Leu Gly Val Asn Glu Val Ser Pro Glu Asn His Lys Phe
260 265 270

Ala Asn Glu Ser Gly Met Ser Leu Leu Asp Phe Arg Phe Ala Gln Lys
275 280 285

Val Arg Gln Val Phe Arg Asp Asn Thr Asp Asn Met Tyr Gly Leu Lys
290 295 300

Ala Met Leu Glu Gly Ser Ala Ala Asp Tyr Ala Gln Val Asp Asp Gln
305 310 315 320

Val Thr Phe Ile Asp Asn His Asp Met Glu Arg Phe His Ala Ser Asn
325 330 335

Ala Asn Arg Arg Lys Leu Glu Gln Ala Leu Ala Phe Thr Leu Thr Ser
340 345 350

Arg Gly Val Pro Ala Ile Tyr Tyr Gly Thr Glu Gln Tyr Met Ser Gly
355 360 365

Gly Thr Asp Pro Asp Asn Arg Ala Arg Ile Pro Ser Phe Ser Thr Ser
370 375 380

Thr Thr Ala Tyr Gln Val Ile Gln Lys Leu Ala Pro Leu Arg Lys Cys
385 390 395 400

Asn Pro Ala Ile Ala Tyr Gly Ser Thr Gln Glu Arg Trp Ile Asn Asn
405 410 415

Asp Val Leu Ile Tyr Glu Arg Lys Phe Gly Ser Asn Val Ala Val Val
420 425 430

Ala Val Asn Arg Asn Leu Asn Ala Pro Ala Ser Ile Ser Gly Leu Val
435 440 445

Thr Ser Leu Pro Gln Gly Ser Tyr Asn Asp Val Leu Gly Gly Leu Leu
450 455 460

Asn Gly Asn Thr Leu Ser Val Gly Ser Gly Gly Ala Ala Ser Asn Phe
465 470 475 480

Thr Leu Ala Ala Gly Gly Thr Ala Val Trp Gln Tyr Thr Ala Ala Thr
485 490 495

Ala Thr Pro Thr Ile Gly His Val Gly Pro Met Met Ala Lys Pro Gly
500 505 510

Val Thr Ile Thr Ile Asp Gly Arg Gly Phe Gly Ser Ser Lys Gly Thr
515 520 525

Val Tyr Phe Gly Thr Thr Ala Val Ser Gly Ala Asp Ile Thr Ser Trp
530 535 540

Glu Asp Thr Gln Ile Lys Val Lys Ile Pro Ala Val Ala Gly Gly Asn
545 550 555 560

Tyr Asn Ile Lys Val Ala Asn Ala Ala Gly Thr Ala Ser Asn Val Tyr
565 570 575

Asp Asn Phe Glu Val Leu Ser Gly Asp Gln Val Ser Val Arg Phe Val
580 585 590

Val Asn Asn Ala Thr Thr Ala Leu Gly Gln Asn Val Tyr Leu Thr Gly
595 600 605

Ser Val Ser Glu Leu Gly Asn Trp Asp Pro Ala Lys Ala Ile Gly Pro
610 615 620

Met Tyr Asn Gln Val Val Tyr Gln Tyr Pro Asn Trp Tyr Tyr Asp Val
625 630 635 640

Ser Val Pro Ala Gly Lys Thr Ile Glu Phe Lys Phe Leu Lys Lys Gln
645 650 655

Gly Ser Thr Val Thr Trp Glu Gly Gly Ser Asn His Thr Phe Thr Ala
660 665 670

Pro Ser Ser Gly Thr Ala Thr Ile Asn Val Asn Trp Gln Pro
675 680 685

<210> 45
<211> 404
<212> PRT
<213> Amycolatopsis orientalis

<400> 45

Met Arg Val Leu Ile Thr Gly Cys Gly Ser Arg Gly Asp Thr Glu Pro
1 5 10 15

Leu Val Ala Leu Ala Ala Arg Leu Arg Glu Leu Gly Ala Asp Ala Arg
20 25 30

Met Cys Leu Pro Pro Asp Tyr Val Glu Arg Cys Ala Glu Val Gly Val
35 40 45

Pro Met Val Pro Val Gly Arg Ala Val Arg Ala Gly Ala Arg Glu Pro
50 55 60

Gly Glu Leu Pro Pro Gly Ala Ala Glu Val Val Thr Glu Val Val Ala
65 70 75 80

Glu Trp Phe Asp Lys Val Pro Ala Ala Ile Glu Gly Cys Asp Ala Val
85 90 95

Val Thr Thr Gly Leu Leu Pro Ala Ala Val Ala Val Arg Ser Met Ala
100 105 110

Glu Lys Leu Gly Ile Pro Tyr Arg Tyr Thr Val Leu Ser Pro Asp His
115 120 125

Leu Pro Ser Glu Gln Ser Gln Ala Glu Arg Asp Met Tyr Asn Gln Gly
130 135 140

Ala Asp Arg Leu Phe Gly Asp Ala Val Asn Ser His Arg Ala Ser Ile
145 150 155 160

Gly Leu Pro Pro Val Glu His Leu Tyr Asp Tyr Gly Tyr Thr Asp Gln
165 170 175

Pro Trp Leu Ala Ala Asp Pro Val Leu Ser Pro Leu Arg Pro Thr Asp
180 185 190

Leu Gly Thr Val Gln Thr Gly Ala Trp Ile Leu Pro Asp Glu Arg Pro
195 200 205

Leu Ser Ala Glu Leu Glu Ala Phe Leu Ala Ala Gly Ser Thr Pro Val
210 215 220

Tyr Val Gly Phe Gly Ser Ser Ser Arg Pro Ala Thr Ala Asp Ala Ala
225 230 235 240

Lys Met Ala Ile Lys Ala Val Arg Ala Ser Gly Arg Arg Ile Val Leu
245 250 255

Ser Arg Gly Trp Ala Asp Leu Val Leu Pro Asp Asp Gly Ala Asp Cys
260 265 270

Phe Val Val Gly Glu Val Asn Leu Gln Glu Leu Phe Gly Arg Val Ala
275 280 285

Ala Ala Ile His His Asp Ser Ala Gly Thr Thr Leu Leu Ala Met Arg
290 295 300

Ala Gly Ile Pro Gln Ile Val Val Arg Arg Val Val Asp Asn Val Val
305 310 315 320

Glu Gln Ala Tyr His Ala Asp Arg Val Ala Glu Leu Gly Val Gly Val
325 330 335

Ala Val Asp Gly Pro Val Pro Thr Ile Asp Ser Leu Ser Ala Ala Leu
340 345 350

Asp Thr Ala Leu Ala Pro Glu Ile Arg Ala Arg Ala Thr Thr Val Ala
355 360 365

Asp Thr Ile Arg Ala Asp Gly Thr Thr Val Ala Ala Gln Leu Leu Phe
370 375 380

Asp Ala Val Ser Leu Glu Lys Pro Thr Val Pro Ala Leu Glu His His
385 390 395 400

His His His His

<210> 46
<211> 292
<212> PRT
<213> Pseudomonas sp.

<400> 46

Ser Ile Glu Arg Leu Gly Tyr Leu Gly Phe Ala Val Lys Asp Val Pro
1 5 10 15

Ala Trp Asp His Phe Leu Thr Lys Ser Val Gly Leu Met Ala Ala Gly
20 25 30

Ser Ala Gly Asp Ala Ala Leu Tyr Arg Ala Asp Gln Arg Ala Trp Arg
35 40 45

Ile Ala Val Gln Pro Gly Glu Leu Asp Asp Leu Ala Tyr Ala Gly Leu
50 55 60

Glu Val Asp Asp Ala Ala Ala Leu Glu Arg Met Ala Asp Lys Leu Arg
65 70 75 80

Gln Ala Gly Val Ala Phe Thr Arg Gly Asp Glu Ala Leu Met Gln Gln
85 90 95

Arg Lys Val Met Gly Leu Leu Cys Leu Gln Asp Pro Phe Gly Leu Pro
100 105 110

Leu Glu Ile Tyr Tyr Gly Pro Ala Glu Ile Phe His Glu Pro Phe Leu
115 120 125

Pro Ser Ala Pro Val Ser Gly Phe Val Thr Gly Asp Gln Gly Ile Gly
130 135 140

His Phe Val Arg Cys Val Pro Asp Thr Ala Lys Ala Met Ala Phe Tyr
145 150 155 160

Thr Glu Val Leu Gly Phe Val Leu Ser Asp Ile Ile Asp Ile Gln Met
165 170 175

Gly Pro Glu Thr Ser Val Pro Ala His Phe Leu His Cys Asn Gly Arg
180 185 190

His His Thr Ile Ala Leu Ala Ala Phe Pro Ile Pro Lys Arg Ile His
195 200 205

His Phe Met Leu Gln Ala Asn Thr Ile Asp Asp Val Gly Tyr Ala Phe
210 215 220

Asp Arg Leu Asp Ala Ala Gly Arg Ile Thr Ser Leu Leu Gly Arg His
225 230 235 240

Thr Asn Asp Gln Thr Leu Ser Phe Tyr Ala Asp Thr Pro Ser Pro Met
245 250 255

Ile Glu Val Glu Phe Gly Trp Gly Pro Arg Thr Val Asp Ser Ser Trp
260 265 270

Thr Val Ala Arg His Ser Arg Thr Ala Met Trp Gly His Lys Ser Val
275 280 285

Arg Gly Gln Arg
290



<210> 47
<211> 311
<212> PRT
<213> Acinetobacter sp.

<400> 47

Met Glu Val Lys Ile Phe Asn Thr Gln Asp Val Gln Asp Phe Leu Arg
1 5 10 15

Val Ala Ser Gly Leu Glu Gln Glu Gly Gly Asn Pro Arg Val Lys Gln
20 25 30

Ile Ile His Arg Val Leu Ser Asp Leu Tyr Lys Ala Ile Glu Asp Leu
35 40 45

Asn Ile Thr Ser Asp Glu Tyr Trp Ala Gly Val Ala Tyr Leu Asn Gln
50 55 60

Leu Gly Ala Asn Gln Glu Ala Gly Leu Leu Ser Pro Gly Leu Gly Phe
65 70 75 80

Asp His Tyr Leu Asp Met Arg Met Asp Ala Glu Asp Ala Ala Leu Gly
85 90 95

Ile Glu Asn Ala Thr Pro Arg Thr Ile Glu Gly Pro Leu Tyr Val Ala
100 105 110

Gly Ala Pro Glu Ser Val Gly Tyr Ala Arg Met Asp Asp Gly Ser Asp
115 120 125

Pro Asn Gly His Thr Leu Ile Leu His Gly Thr Ile Phe Asp Ala Asp
130 135 140

Gly Lys Pro Leu Pro Asn Ala Lys Val Glu Ile Trp His Ala Asn Thr
145 150 155 160

Lys Gly Phe Tyr Ser His Phe Asp Pro Thr Gly Glu Gln Gln Ala Phe
165 170 175

Asn Met Arg Arg Ser Ile Ile Thr Asp Glu Asn Gly Gln Tyr Arg Val
180 185 190

Arg Thr Ile Leu Pro Ala Gly Tyr Gly Cys Pro Pro Glu Gly Pro Thr
195 200 205

Gln Gln Leu Leu Asn Gln Leu Gly Arg His Gly Asn Arg Pro Ala His
210 215 220

Ile His Tyr Phe Val Ser Ala Asp Gly His Arg Lys Leu Thr Thr Gln
225 230 235 240

Ile Asn Val Ala Gly Asp Pro Tyr Thr Tyr Asp Asp Phe Ala Tyr Ala
245 250 255

Thr Arg Glu Gly Leu Val Val Asp Ala Val Glu His Thr Asp Pro Glu
260 265 270

Ala Ile Lys Ala Asn Asp Val Glu Gly Pro Phe Ala Glu Met Val Phe
275 280 285

Asp Leu Lys Leu Thr Arg Leu Val Asp Gly Val Asp Asn Gln Val Val
290 295 300

Asp Arg Pro Arg Leu Ala Val
305 310



<210> 48
<211> 414
<212> PRT
<213> Pseudomonas putida

<400> 48

Thr Thr Glu Thr Ile Gln Ser Asn Ala Asn Leu Ala Pro Leu Pro Pro
1 5 10 15

His Val Pro Glu His Leu Val Phe Asp Phe Asp Met Tyr Asn Pro Ser
20 25 30

Asn Leu Ser Ala Gly Val Gln Glu Ala Trp Ala Val Leu Gln Glu Ser
35 40 45

Asn Val Pro Asp Leu Val Trp Thr Arg Cys Asn Gly Gly His Trp Ile
50 55 60

Ala Thr Arg Gly Gln Leu Ile Arg Glu Ala Tyr Glu Asp Tyr Arg His
65 70 75 80

Phe Ser Ser Glu Cys Pro Phe Ile Pro Arg Glu Ala Gly Glu Ala Tyr
85 90 95

Asp Phe Ile Pro Thr Ser Met Asp Pro Pro Glu Gln Arg Gln Phe Arg
100 105 110

Ala Leu Ala Asn Gln Val Val Gly Met Pro Val Val Asp Lys Leu Glu
115 120 125

Asn Arg Ile Gln Glu Leu Ala Cys Ser Leu Ile Glu Ser Leu Arg Pro
130 135 140

Gln Gly Gln Cys Asn Phe Thr Glu Asp Tyr Ala Glu Pro Phe Pro Ile
145 150 155 160

Arg Ile Phe Met Leu Leu Ala Gly Leu Pro Glu Glu Asp Ile Pro His
165 170 175

Leu Lys Tyr Leu Thr Asp Gln Met Thr Arg Pro Asp Gly Ser Met Thr
180 185 190

Phe Ala Glu Ala Lys Glu Ala Leu Tyr Asp Tyr Leu Ile Pro Ile Ile
195 200 205

Glu Gln Arg Arg Gln Lys Pro Gly Thr Asp Ala Ile Ser Ile Val Ala
210 215 220

Asn Gly Gln Val Asn Gly Arg Pro Ile Thr Ser Asp Glu Ala Lys Arg
225 230 235 240

Met Cys Gly Leu Leu Leu Val Gly Gly Leu Asp Thr Val Val Asn Phe
245 250 255

Leu Ser Phe Ser Met Glu Phe Leu Ala Lys Ser Pro Glu His Arg Gln
260 265 270

Glu Leu Ile Gln Arg Pro Glu Arg Ile Pro Ala Ala Cys Glu Glu Leu
275 280 285

Leu Arg Arg Phe Ser Leu Val Ala Asp Gly Arg Ile Leu Thr Ser Asp
290 295 300

Tyr Glu Phe His Gly Val Gln Leu Lys Lys Gly Asp Gln Ile Leu Leu
305 310 315 320

Pro Gln Met Leu Ser Gly Leu Asp Glu Arg Glu Asn Ala Cys Pro Met
325 330 335

His Val Asp Phe Ser Arg Gln Lys Val Ser His Thr Thr Phe Gly His
340 345 350

Gly Ser His Leu Cys Leu Gly Gln His Leu Ala Arg Arg Glu Ile Ile
355 360 365

Val Thr Leu Lys Glu Trp Leu Thr Arg Ile Pro Asp Phe Ser Ile Ala
370 375 380

Pro Gly Ala Gln Ile Gln His Lys Ser Gly Ile Val Ser Gly Val Gln
385 390 395 400

Ala Leu Pro Leu Val Trp Asp Pro Ala Thr Thr Lys Ala Val
405 410

<210> 49
<211> 374
<212> PRT
<213> Equus caballus

<400> 49

Ser Thr Ala Gly Lys Val Ile Lys Cys Lys Ala Ala Val Leu Trp Glu
1 5 10 15

Glu Lys Lys Pro Phe Ser Ile Glu Glu Val Glu Val Ala Pro Pro Lys
20 25 30

Ala His Glu Val Arg Ile Lys Met Val Ala Thr Gly Ile Cys Arg Ser
35 40 45

Asp Asp His Val Val Ser Gly Thr Leu Val Thr Pro Leu Pro Val Ile
50 55 60

Ala Gly His Glu Ala Ala Gly Ile Val Glu Ser Ile Gly Glu Gly Val
65 70 75 80

Thr Thr Val Arg Pro Gly Asp Lys Val Ile Pro Leu Phe Thr Pro Gln
85 90 95

Cys Gly Lys Cys Arg Val Cys Lys His Pro Glu Gly Asn Phe Cys Leu
100 105 110

Lys Asn Asp Leu Ser Met Pro Arg Gly Thr Met Gln Asp Gly Thr Ser
115 120 125

Arg Phe Thr Cys Arg Gly Lys Pro Ile His His Phe Leu Gly Thr Ser
130 135 140

Thr Phe Ser Gln Tyr Thr Val Val Asp Glu Ile Ser Val Ala Lys Ile
145 150 155 160

Asp Ala Ala Ser Pro Leu Glu Lys Val Cys Leu Ile Gly Cys Gly Phe
165 170 175

Ser Thr Gly Tyr Gly Ser Ala Val Lys Val Ala Lys Val Thr Gln Gly
180 185 190

Ser Thr Cys Ala Val Phe Gly Leu Gly Gly Val Gly Leu Ser Val Ile
195 200 205

Met Gly Cys Lys Ala Ala Gly Ala Ala Arg Ile Ile Gly Val Asp Ile
210 215 220

Asn Lys Asp Lys Phe Ala Lys Ala Lys Glu Val Gly Ala Thr Glu Cys
225 230 235 240

Val Asn Pro Gln Asp Tyr Lys Lys Pro Ile Gln Glu Val Leu Thr Glu
245 250 255

Met Ser Asn Gly Gly Val Asp Phe Ser Phe Glu Val Ile Gly Arg Leu
260 265 270

Asp Thr Met Val Thr Ala Leu Ser Cys Cys Gln Glu Ala Tyr Gly Val
275 280 285

Ser Val Ile Val Gly Val Pro Pro Asp Ser Gln Asn Leu Ser Met Asn
290 295 300

Pro Met Leu Leu Leu Ser Gly Arg Thr Trp Lys Gly Ala Ile Phe Gly
305 310 315 320

Gly Phe Lys Ser Lys Asp Ser Val Pro Lys Leu Val Ala Asp Phe Met
325 330 335

Ala Lys Lys Phe Ala Leu Asp Pro Leu Ile Thr His Val Leu Pro Phe
340 345 350

Glu Lys Ile Asn Glu Gly Phe Asp Leu Leu Arg Ser Gly Glu Ser Ile
355 360 365

Arg Thr Ile Leu Thr Phe
370



<210> 50
<211> 297
<212> PRT
<213> Escherichia coli

<400> 50

Met Ala Thr Asn Leu Arg Gly Val Met Ala Ala Leu Leu Thr Pro Phe
1 5 10 15

Asp Gln Gln Gln Ala Leu Asp Lys Ala Ser Leu Arg Arg Leu Val Gln
20 25 30

Phe Asn Ile Gln Gln Gly Ile Asp Gly Leu Tyr Val Gly Gly Ser Thr
35 40 45

Gly Glu Ala Phe Val Gln Ser Leu Ser Glu Arg Glu Gln Val Leu Glu
50 55 60

Ile Val Ala Glu Glu Gly Lys Gly Lys Ile Lys Leu Ile Ala His Val
65 70 75 80

Gly Cys Val Thr Thr Ala Glu Ser Gln Gln Leu Ala Ala Ser Ala Lys
85 90 95

Arg Tyr Gly Phe Asp Ala Val Ser Ala Val Thr Pro Phe Tyr Tyr Pro
100 105 110

Phe Ser Phe Glu Glu His Cys Asp His Tyr Arg Ala Ile Ile Asp Ser
115 120 125

Ala Asp Gly Leu Pro Met Val Val Tyr Asn Ile Pro Ala Leu Ser Gly
130 135 140

Val Lys Leu Thr Leu Asp Gln Ile Asn Thr Leu Val Thr Leu Pro Gly
145 150 155 160

Val Gly Ala Leu Lys Gln Thr Ser Gly Asp Leu Tyr Gln Met Glu Gln
165 170 175

Ile Arg Arg Glu His Pro Asp Leu Val Leu Tyr Asn Gly Tyr Asp Glu
180 185 190

Ile Phe Ala Ser Gly Leu Leu Ala Gly Ala Asp Gly Gly Ile Gly Ser
195 200 205

Thr Tyr Asn Ile Met Gly Trp Arg Tyr Gln Gly Ile Val Lys Ala Leu
210 215 220

Lys Glu Gly Asp Ile Gln Thr Ala Gln Lys Leu Gln Thr Glu Cys Asn
225 230 235 240

Lys Val Ile Asp Leu Leu Ile Lys Thr Gly Val Phe Arg Gly Leu Lys
245 250 255

Thr Val Leu His Tyr Met Asp Val Val Ser Val Pro Leu Cys Arg Lys
260 265 270

Pro Phe Gly Pro Val Asp Glu Lys Tyr Leu Pro Glu Leu Lys Ala Leu
275 280 285

Ala Gln Gln Leu Met Gln Glu Arg Gly
290 295



<210> 51
<211> 268
<212> PRT
<213> Salmonella typhimurium

<400> 51

Met Glu Arg Tyr Glu Asn Leu Phe Ala Gln Leu Asn Asp Arg Arg Glu
1 5 10 15

Gly Ala Phe Val Pro Phe Val Thr Leu Gly Asp Pro Gly Ile Glu Gln
20 25 30

Ser Leu Lys Ile Ile Asp Thr Leu Ile Asp Ala Gly Ala Asp Ala Leu
35 40 45

Glu Leu Gly Val Pro Phe Ser Asp Pro Leu Ala Asp Gly Pro Thr Ile
50 55 60

Gln Asn Ala Asn Leu Arg Ala Phe Ala Ala Gly Val Thr Pro Ala Gln
65 70 75 80

Cys Phe Glu Met Leu Ala Leu Ile Arg Glu Lys His Pro Thr Ile Pro
85 90 95

Ile Gly Leu Leu Met Tyr Ala Asn Leu Val Phe Asn Asn Gly Ile Asp
100 105 110

Ala Phe Tyr Ala Arg Cys Glu Gln Val Gly Val Asp Ser Val Leu Val
115 120 125

Ala Asp Val Pro Val Glu Glu Ser Ala Pro Phe Arg Gln Ala Ala Leu
130 135 140

Arg His Asn Ile Ala Pro Ile Phe Ile Cys Pro Pro Asn Ala Asp Asp
145 150 155 160

Asp Leu Leu Arg Gln Val Ala Ser Tyr Gly Arg Gly Tyr Thr Tyr Leu
165 170 175

Leu Ser Arg Ser Gly Val Thr Gly Ala Glu Asn Arg Gly Ala Leu Pro
180 185 190

Leu His His Leu Ile Glu Lys Leu Lys Glu Tyr His Ala Ala Pro Ala
195 200 205

Leu Gln Gly Phe Gly Ile Ser Ser Pro Glu Gln Val Ser Ala Ala Val
210 215 220

Arg Ala Gly Ala Ala Gly Ala Ile Ser Gly Ser Ala Ile Val Lys Ile
225 230 235 240

Ile Glu Lys Asn Leu Ala Ser Pro Lys Gln Met Leu Ala Glu Leu Arg
245 250 255

Ser Phe Val Ser Ala Met Lys Ala Ala Ser Arg Ala
260 265



<210> 52
<211> 393
<212> PRT
<213> Actinoplanes missouriensis

<400> 52

Ser Val Gln Ala Thr Arg Glu Asp Lys Phe Ser Phe Gly Leu Trp Thr
1 5 10 15

Val Gly Trp Gln Ala Arg Asp Ala Phe Gly Asp Ala Thr Arg Thr Ala
20 25 30

Leu Asp Pro Val Glu Ala Val His Lys Leu Ala Glu Ile Gly Ala Tyr
35 40 45

Gly Ile Thr Phe His Asp Asp Asp Leu Val Pro Phe Gly Ser Asp Ala
50 55 60

Gln Thr Arg Asp Gly Ile Ile Ala Gly Phe Lys Lys Ala Leu Asp Glu
65 70 75 80

Thr Gly Leu Ile Val Pro Met Val Thr Thr Asn Leu Phe Thr His Pro
85 90 95

Val Phe Lys Asp Gly Gly Phe Thr Ser Asn Asp Arg Ser Val Arg Arg
100 105 110

Tyr Ala Ile Arg Lys Val Leu Arg Gln Met Asp Leu Gly Ala Glu Leu
115 120 125

Gly Ala Lys Thr Leu Val Leu Trp Gly Gly Arg Glu Gly Ala Glu Tyr
130 135 140

Asp Ser Ala Lys Asp Val Ser Ala Ala Leu Asp Arg Tyr Arg Glu Ala
145 150 155 160

Leu Asn Leu Leu Ala Gln Tyr Ser Glu Asp Arg Gly Tyr Gly Leu Arg
165 170 175

Phe Ala Ile Glu Pro Lys Pro Asn Glu Pro Arg Gly Asp Ile Leu Leu
180 185 190

Pro Thr Ala Gly His Ala Ile Ala Phe Val Gln Glu Leu Glu Arg Pro
195 200 205

Glu Leu Phe Gly Ile Asn Pro Glu Thr Gly Asn Glu Gln Met Ser Asn
210 215 220

Leu Asn Phe Thr Gln Gly Ile Ala Gln Ala Leu Trp His Lys Lys Leu
225 230 235 240

Phe His Ile Asp Leu Asn Gly Gln His Gly Pro Lys Phe Asp Gln Asp
245 250 255

Leu Val Phe Gly His Gly Asp Leu Leu Asn Ala Phe Ser Leu Val Asp
260 265 270

Leu Leu Glu Asn Gly Pro Asp Gly Ala Pro Ala Tyr Asp Gly Pro Arg
275 280 285

His Phe Asp Tyr Lys Pro Ser Arg Thr Glu Asp Tyr Asp Gly Val Trp
290 295 300

Glu Ser Ala Lys Ala Asn Ile Arg Met Tyr Leu Leu Leu Lys Glu Arg
305 310 315 320

Ala Lys Ala Phe Arg Ala Asp Pro Glu Val Gln Glu Ala Leu Ala Ala
325 330 335

Ser Lys Val Ala Glu Leu Lys Thr Pro Thr Leu Asn Pro Gly Glu Gly
340 345 350

Tyr Ala Glu Leu Leu Ala Asp Arg Ser Ala Phe Glu Asp Tyr Asp Ala
355 360 365

Asp Ala Val Gly Ala Lys Gly Phe Gly Phe Val Lys Leu Asn Gln Leu
370 375 380

Ala Ile Glu His Leu Leu Gly Ala Arg
385 390



<210> 53
<211> 348
<212> PRT
<213> Bacteriophage T7

<400> 53

Val Asn Ile Lys Thr Asn Pro Phe Lys Ala Val Ser Phe Val Glu Ser
1 5 10 15

Ala Ile Lys Lys Ala Leu Asp Asn Ala Gly Tyr Leu Ile Ala Glu Ile
20 25 30

Lys Tyr Asp Gly Val Arg Gly Asn Ile Cys Val Asp Asn Thr Ala Asn
35 40 45

Ser Tyr Trp Leu Ser Arg Val Ser Lys Thr Ile Pro Ala Leu Glu His
50 55 60

Leu Asn Gly Phe Asp Val Arg Trp Lys Arg Leu Leu Asn Asp Asp Arg
65 70 75 80

Cys Phe Tyr Lys Asp Gly Phe Met Leu Asp Gly Glu Leu Met Val Lys
85 90 95

Gly Val Asp Phe Asn Thr Gly Ser Gly Leu Leu Arg Thr Lys Trp Thr
100 105 110

Asp Thr Lys Asn Gln Glu Phe His Glu Glu Leu Phe Val Glu Pro Ile
115 120 125

Arg Lys Lys Asp Lys Val Pro Phe Lys Leu His Thr Gly His Leu His
130 135 140

Ile Lys Leu Tyr Ala Ile Leu Pro Leu His Ile Val Glu Ser Gly Glu
145 150 155 160

Asp Cys Asp Val Met Thr Leu Leu Met Gln Glu His Val Lys Asn Met
165 170 175

Leu Pro Leu Leu Gln Glu Tyr Phe Pro Glu Ile Glu Trp Gln Ala Ala
180 185 190

Glu Ser Tyr Glu Val Tyr Asp Met Val Glu Leu Gln Gln Leu Tyr Glu
195 200 205

Gln Lys Arg Ala Glu Gly His Glu Gly Leu Ile Val Lys Asp Pro Met
210 215 220

Cys Ile Tyr Lys Arg Gly Lys Lys Ser Gly Trp Trp Lys Met Lys Pro
225 230 235 240

Glu Asn Glu Ala Asp Gly Ile Ile Gln Gly Leu Val Trp Gly Thr Lys
245 250 255

Gly Leu Ala Asn Glu Gly Lys Val Ile Gly Phe Glu Val Leu Leu Glu
260 265 270

Ser Gly Arg Leu Val Asn Ala Thr Asn Ile Ser Arg Ala Leu Met Asp
275 280 285

Glu Phe Thr Glu Thr Val Lys Glu Ala Thr Leu Ser Gln Trp Gly Phe
290 295 300

Phe Ser Pro Tyr Gly Ile Gly Asp Asn Asp Ala Cys Thr Ile Asn Pro
305 310 315 320

Tyr Asp Gly Trp Ala Cys Gln Ile Ser Tyr Met Glu Glu Thr Pro Asp
325 330 335

Gly Ser Leu Arg His Pro Ser Phe Val Met Phe Arg
340 345

<210> 54
<211> 42
<212> DNA
<213> artificial sequence
<220>
<223> binding site for restr1 and restr2
<220>
<221> CDS
<222> (2)..(40)
<223>

<400> 54

g gtg gta tca gca ggc cac tgc tac aag tcc cgc atc cag gt 42
  Val Val Ser Ala Gly His Cys Tyr Lys Ser Arg Ile Gln
  1 5 10

<210> 55
<211> 13
<212> PRT
<213> artificial sequence
<220>
<223> binding site for restr1 and restr2

<400> 55

Val Val Ser Ala Gly His Cys Tyr Lys Ser Arg Ile Gln
1 5 10

<210> 56
<211> 42
<212> DNA
<213> artificial sequence
<220>
<223> forward primer restr1

<400> 56

ggtggtatcc gcgggccact gctacaagtc ccggatccag gt 42

<210> 57
<211> 42
<212> DNA
<213> artificial sequence
<220>
<223> reverse primer restr2
<400> 57

acctggatcc gggacttgta gcagtggccc gcggatacca cc 42 
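
The two primers restr1 and restr2 appear to be a complementary pair: read as plain strings, SEQ ID NO: 57 is the reverse complement of SEQ ID NO: 56. The following minimal sketch checks this; the function and variable names are illustrative and not part of the listing.

```python
# Primer sequences copied from SEQ ID NO: 56 and SEQ ID NO: 57 (spaces removed).
FORWARD_RESTR1 = "ggtggtatccgcgggccactgctacaagtcccggatccaggt"
REVERSE_RESTR2 = "acctggatccgggacttgtagcagtggcccgcggataccacc"

# Watson-Crick pairing for lowercase DNA letters.
_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # True if the two primers anneal to each other over their full length.
    print(reverse_complement(FORWARD_RESTR1) == REVERSE_RESTR2)
```

The same check applied to the restr3/restr4 pair (SEQ ID NO: 60 and 61) would not give an exact match, since those two sequences as listed differ at a few positions.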

<210> 58
<211> 50
<212> DNA
<213> artificial sequence
<220>
<223> binding site for restr3 and restr4
<220>
<221> CDS
<222> (3)..(50)
<223>

<400> 58

cc act ggc acg aag tgc ctc atc tct ggc tgg ggc aac act gcg agc
   Thr Gly Thr Lys Cys Leu Ile Ser Gly Trp Gly Asn Thr Ala Ser
   1 5 10 15

tct
Ser

<210> 59
<211> 16
<212> PRT
<213> artificial sequence
<220>
<223> binding site for restr3 and restr4
<400> 59

Thr Gly Thr Lys Cys Leu Ile Ser Gly Trp Gly Asn Thr Ala Ser Ser
1 5 10 15

<210> 60
<211> 50
<212> DNA
<213> artificial sequence
<220>
<223> forward primer restr3
<400> 60

ccactggcac gaagtgcctc atctctggct ggggcaacac tgcgagctct 50 
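
The 50 nucleotides of SEQ ID NO: 60 are the same as those of the binding site in SEQ ID NO: 58, whose CDS feature spans positions (3)..(50). As a hedged sketch, translating from position 3 with the standard genetic code (an assumption; the listing does not name a translation table) reproduces the 16-residue peptide of SEQ ID NO: 59. All identifiers below are illustrative.

```python
from itertools import product

# Standard genetic code in the conventional t, c, a, g ordering (assumed table).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna: str, start: int) -> list[str]:
    """Translate complete codons of `dna` beginning at 1-based position `start`."""
    coding = dna[start - 1:]
    usable = len(coding) - len(coding) % 3  # drop any trailing partial codon
    codons = [coding[i:i + 3] for i in range(0, usable, 3)]
    return [THREE_LETTER[CODON_TABLE[c]] for c in codons]


# SEQ ID NO: 60 with the spacing removed.
SEQ_60 = "ccactggcacgaagtgcctcatctctggctggggcaacactgcgagctct"

if __name__ == "__main__":
    print(" ".join(translate(SEQ_60, 3)))
```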

<210> 61
<211> 50
<212> DNA
<213> artificial sequence
<220>
<223> reverse primer restr4
<400> 61

agagctagca gtgttgcccc agccagagat gaggcacttg gtaccagtgg 50 

<210> 62
<211> 30
<212> DNA
<213> artificial sequence
<220>
<223> primer puc-forward

<400> 62

ggggtacccc accaccatga atccactcct 30

<210> 63
<211> 30
<212> DNA
<213> artificial sequence
<220>
<223> primer puc-reverse

<400> 63

cgggatccgg tatagagact gaagagatac 30

<210> 64
<211> 39
<212> DNA
<213> artificial sequence
<220>
<223> oligox-SDR1f
<220>
<221> misc_feature
<222> (14)..(31)
<223> any nucleotide
<220>
<221> misc_feature
<222> (14)..(31)
<223> any nucleotide or amino acid residue
<220>
<221> CDS
<222> (2)..(37)
<223>

<400> 64

g ggc cac tgc tac nnn nnn nnn nnn nnn nnn aag tcc cg 39
  Gly His Cys Tyr Xaa Xaa Xaa Xaa Xaa Xaa Lys Ser
  1 5 10

<210> 65
<211> 12
<212> PRT
<213> artificial sequence
<220>
<221> misc_feature
<222> (5)..(5)
<223> The 'Xaa' at location 5 stands for Lys, Asn, Arg, Ser, Thr, Ile, Met, Glu, Asp, Gly, Ala, Val, Gln, His, Pro, Leu, a stop codon, Tyr, Trp, Cys, or Phe.
<220>
<221> misc_feature
<222> (6)..(6)
<223> The 'Xaa' at location 6 stands for Lys, Asn, Arg, Ser, Thr, Ile, Met, Glu, Asp, Gly, Ala, Val, Gln, His, Pro, Leu, a stop codon, Tyr, Trp, Cys, or Phe.
<220>
<221> misc_feature
<222> (7)..(7)
<223> The 'Xaa' at location 7 stands for Lys, Asn, Arg, Ser, Thr, Ile, Met, Glu, Asp, Gly, Ala, Val, Gln, His, Pro, Leu, a stop codon, Tyr, Trp, Cys, or Phe.
<220>
<221> misc_feature
<222> (8)..(8)
<223> The 'Xaa' at location 8 stands for Lys, Asn, Arg, Ser, Thr, Ile, Met, Glu, Asp, Gly, Ala, Val, Gln, His, Pro, Leu, a stop codon, Tyr, Trp, Cys, or Phe.
<220>
<221> misc_feature
<222> (9)..(9)
<223> The 'Xaa' at location 9 stands for Lys, Asn, Arg, Ser, Thr, Ile, Met, Glu, Asp, Gly, Ala, Val, Gln, His, Pro, Leu, a stop codon, Tyr, Trp, Cys, or Phe.
<220>
<221> misc_feature
<222> (10)..(10)
<223> The 'Xaa' at location 10 stands for Lys, Asn, Arg, Ser, Thr, Ile, Met, Glu, Asp, Gly, Ala, Val, Gln, His, Pro, Leu, a stop codon, Tyr, Trp, Cys, or Phe.
<220>
<223> oligox-SDR1f
<220>
<221> misc_feature
<222> (14)..(14)
<223> any nucleotide
<220>
<221> misc_feature
<222> (14)..(31)
<223> any nucleotide or amino acid residue
<400> 65

Gly His Cys Tyr Xaa Xaa Xaa Xaa Xaa Xaa Lys Ser
1 5 10

<210> 66
<211> 45
<212> DNA
<213> artificial sequence
<220>
<223> oligox-SDR1r
<220>
<221> misc_feature
<222> (16)..(33)
<223> any nucleotide

<400> 66

cgcccggtga cgatgnnnnn nnnnnnnnnn nnnttcaggg cctag 45

<210> 67
<211> 47
<212> DNA
<213> artificial sequence
<220>
<223> oligox-SDR2f
<220>
<221> CDS
<222> (2)..(46)
<223>
<220>
<221> misc_feature
<222> (29)..(43)
<223> any nucleotide or amino acid residue

<400> 67

c aag tgc ctc atc tct ggc tgg ggc aac nnn nnn nnn nnn nnn act g 47
  Lys Cys Leu Ile Ser Gly Trp Gly Asn Xaa Xaa Xaa Xaa Xaa Thr
  1 5 10 15

<210> 68
<211> 15
<212> PRT
<213> artificial sequence
<220>
<221> misc_feature
<222> (10)..(10)
<223> The 'Xaa' at location 10 stands for Lys, Asn, Arg, Ser, Thr, Ile, Met, Glu, Asp, Gly, Ala, Val, Gln, His, Pro, Leu, a stop codon, Tyr, Trp, Cys, or Phe.
<220>
<221> misc_feature
<222> (11)..(11)
<223> The 'Xaa' at location 11 stands for Lys, Asn, Arg, Ser, Thr, Ile, Met, Glu, Asp, Gly, Ala, Val, Gln, His, Pro, Leu, a stop codon, Tyr, Trp, Cys, or Phe.
<220>
<221> misc_feature
<222> (12)..(12)
<223> The 'Xaa' at location 12 stands for Lys, Asn, Arg, Ser, Thr, Ile, Met, Glu, Asp, Gly, Ala, Val, Gln, His, Pro, Leu, a stop codon, Tyr, Trp, Cys, or Phe.
<220>
<221> misc_feature
<222> (13)..(13)
<223> The 'Xaa' at location 13 stands for Lys, Asn, Arg, Ser, Thr, Ile, Met, Glu, Asp, Gly, Ala, Val, Gln, His, Pro, Leu, a stop codon, Tyr, Trp, Cys, or Phe.
<220>
<221> misc_feature
<222> (14)..(14)
<223> The 'Xaa' at location 14 stands for Lys, Asn, Arg, Ser, Thr, Ile, Met, Glu, Asp, Gly, Ala, Val, Gln, His, Pro, Leu, a stop codon, Tyr, Trp, Cys, or Phe.
<220>
<223> oligox-SDR2f
<220>
<221> misc_feature
<222> (29)..(43)
<223> any nucleotide or amino acid residue

<400> 68

Lys Cys Leu Ile Ser Gly Trp Gly Asn Xaa Xaa Xaa Xaa Xaa Thr
1 5 10 15



<210> 69
<211> 55
<212> DNA
<213> artificial sequence
<220>
<223> oligox-SDR2r
<220>
<221> misc_feature
<222> (33)..(47)
<223> any base
<220>
<221> misc_feature
<222> (33)..(47)
<223> any nucleotide

<400> 69

catggttcac ggagtagaga ccgaccccgt tgnnnnnnnn nnnnnnntga cgatc 55

<210> 70
<211> 59
<212> DNA
<213> artificial sequence
<220>
<223> primer SDR1-mutnnb-forward
<220>
<221> misc_feature
<222> (24)..(40)
<223> N=A, C, G, T; B=C, G, T; V=A, C, G

<400> 70

tggtatccgc gggccactgc tacnnbnnbn nbnnbnnbnn baagtcccgg atccaggtg 59 
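
The `<223>` line above defines N as A, C, G or T and B as C, G or T, so each randomized `nnb` triplet excludes codons ending in A. A minimal sketch, assuming the standard genetic code (the listing does not name a table), enumerates the codons reachable by NNB and by a fully random NNN triplet; the helper names are illustrative only.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering (assumed table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Degenerate symbols as defined in the <223> line of SEQ ID NO: 70.
IUPAC = {"N": "ACGT", "B": "CGT", "V": "ACG"}


def expand(degenerate_codon: str) -> list[str]:
    """List every concrete codon matched by a three-letter degenerate codon."""
    return ["".join(c) for c in product(*(IUPAC[s] for s in degenerate_codon))]


def summarize(degenerate_codon: str) -> tuple[int, int, list[str]]:
    """Return (number of codons, number of amino acids, stop codons)."""
    codons = expand(degenerate_codon)
    amino_acids = {CODON_TABLE[c] for c in codons} - {"*"}
    stops = sorted(c for c in codons if CODON_TABLE[c] == "*")
    return len(codons), len(amino_acids), stops


if __name__ == "__main__":
    for scheme in ("NNN", "NNB"):
        print(scheme, summarize(scheme))
```

Under this assumption NNB still reaches all 20 amino acids while leaving TAG as the only possible stop codon, compared with three stop codons for NNN.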



<210> 71
<211> 52
<212> DNA
<213> artificial sequence
<220>
<223> primer SDR2-mutnnb-reverse
<220>
<221> misc_feature
<222> (20)..(33)
<223> N=A, C, G, T; B=C, G, T; V=A, C, G

<400> 71

ggcgccagag ctagcagtvn nvnnvnnvnn vnngttgccc cagccagaga tg 52 

<210> 72
<211> 6
<212> PRT
<213> artificial sequence
<220>
<223> variant g SDR1
<400> 72

Ala Phe Phe Asn Gly Asp
1 5

<210> 73
<211> 5
<212> PRT
<213> artificial sequence
<220>
<223> variant g SDR2

<400> 73

Arg Lys Asp Pro Trp
1 5

<210> 74
<211> 234
<212> PRT
<213> artificial sequence
<220>
<223> artificial sequence

<400> 74

Ile Val Gly Gly Tyr Asn Cys Glu Glu Asn Ser Val Pro Tyr Gln Val
1 5 10 15

Ser Leu Asn Ser Gly Tyr His Phe Cys Gly Gly Ser Leu Ile Asn Glu
20 25 30

Gln Trp Val Val Ser Ala Gly His Cys Tyr Ala Ala Phe Asn Gly Lys
35 40 45

Ser Arg Ile Gln Val Arg Leu Gly Glu His Asn Ile Glu Val Leu Glu
50 55 60

Gly Asn Glu Gln Phe Ile Asn Ala Ala Lys Ile Ile Arg His Pro Gln
65 70 75 80

Tyr Asp Arg Lys Thr Leu Asn Asn Asp Ile Met Leu Ile Lys Leu Ser
85 90 95

Ser Arg Ala Val Ile Asn Ala Arg Val Ser Thr Ile Ser Leu Pro Thr
100 105 110

Ala Pro Pro Ala Thr Gly Thr Lys Cys Leu Ile Ser Gly Trp Gly Asn
115 120 125

Arg Lys Asp Phe Trp Thr Ala Ser Ser Gly Ala Asp Tyr Pro Asp Glu
130 135 140

Leu Gln Cys Leu Asp Ala Pro Val Leu Ser Gln Ala Lys Cys Glu Ala
145 150 155 160

Ser Tyr Pro Gly Lys Ile Thr Ser Asn Met Phe Cys Val Gly Phe Leu
165 170 175

Glu Gly Gly Lys Asp Ser Cys Gln Gly Asp Ser Gly Gly Pro Val Val
180 185 190

Cys Asn Gly Gln Leu Gln Gly Val Val Ser Trp Gly Asp Gly Cys Ala
195 200 205

Gln Lys Asn Lys Pro Gly Val Tyr Thr Lys Val Tyr Asn Tyr Val Lys
210 215 220

Trp Ile Lys Asn Thr Ile Ala Ala Asn Ser
225 230



<210> 75
<211> 234
<212> PRT
<213> artificial sequence
<220>
<223> artificial sequence
<400> 75

Ile Val Gly Gly Tyr Asn Cys Glu Glu Asn Ser Val Pro Tyr Gln Val
1 5 10 15

Ser Leu Asn Ser Gly Tyr His Phe Cys Gly Gly Ser Leu Ile Asn Glu
20 25 30

Gln Trp Val Val Ser Ala Gly His Cys Tyr Ala Ala Phe Asn Gly Lys
35 40 45

Ser Arg Ile Gln Val Arg Leu Gly Glu His Asn Ile Gly Val Leu Glu
50 55 60

Gly Asn Glu Gln Phe Ile Asn Ala Ala Lys Ile Ile Arg His Pro Gln
65 70 75 80

Tyr Asp Trp Lys Thr Leu Asn Asn Asp Ile Met Leu Ile Lys Leu Ser
85 90 95

Ser Arg Ala Val Ile Asn Ala Arg Val Ser Thr Ile Ser Leu Pro Thr
100 105 110

Ala Pro Pro Ala Thr Gly Thr Lys Cys Leu Ile Ser Gly Trp Gly Asn
115 120 125

Arg Lys Asp Phe Trp Thr Ala Ser Ser Gly Ala Asp Phe Pro Asp Glu
130 135 140

Leu Gln Cys Leu Asp Ala Pro Val Leu Ser Gln Thr Lys Cys Glu Ala
145 150 155 160

Ser Tyr Pro Gly Lys Ile Thr Ser Asn Met Phe Cys Val Gly Phe Leu
165 170 175

Glu Gly Gly Lys Asp Ser Cys Gln Gly Asp Ser Gly Gly Pro Val Val
180 185 190

Arg Asn Gly Gln Leu Gln Gly Val Val Ser Trp Gly Asp Gly Cys Ala
195 200 205

Gln Lys Asn Lys Pro Gly Val Tyr Thr Lys Val Tyr Asn Tyr Val Lys
210 215 220

Trp Ile Lys Asn Thr Ile Ala Ala Asn Ser
225 230

<400> 75 

ggcgccagag ctagcagtnn nnnnnnnnnn nnngttgccc cagccagaga tg 52 






<210> 76
<211> 12
<212> PRT
<213> artificial sequence
<220>
<223> substrate A
<400> 76

Leu Leu Trp Leu Gly Arg Val Val Gly Gly Pro Val
1 5 10

<210> 77
<211> 12
<212> PRT
<213> artificial sequence
<220>
<223> substrate B

<400> 77

Lys Lys Trp Leu Gly Arg Val Pro Gly Gly Pro Val
1 5 10

<210> 78
<211> 6
<212> PRT
<213> artificial sequence
<220>
<223> variant1 SDR1
<400> 78

Asp Ala Val Gly Arg Asp
1 5

<210> 79
<211> 6
<212> PRT
<213> artificial sequence
<220>
<223> variant2 SDR1
<400> 79

Asn Gly Arg Asp Leu Glu
1 5

<210> 80
<211> 6
<212> PRT
<213> artificial sequence
<220>
<223> variant3 SDR1

<400> 80

Gly Phe Val Met Phe Asn
1 5



<210> 81
<211> 5
<212> PRT
<213> artificial sequence
<220>
<223> variant1 SDR2
<400> 81

Arg Val His Pro Ser
1 5

<210> 82
<211> 5
<212> PRT
<213> artificial sequence
<220>
<223> variant2 SDR2
<400> 82

Val Arg Gly Thr Trp
1 5

<210> 83
<211> 5
<212> PRT
<213> artificial sequence
<220>
<223> variant3 SDR2

<400> 83

Arg Ser Pro Leu Thr
1 5

<210> 84
<211> 6
<212> PRT
<213> artificial sequence
<220>
<223> variant a SDR1

<400> 84

Arg Pro Trp Asp Pro Ser
1 5

<210> 85
<211> 6
<212> PRT
<213> artificial sequence
<220>
<223> variant b SDR1
<400> 85

Gly Phe Val Met Phe Asn
1 5

<210> 86
<211> 6
<212> PRT
<213> artificial sequence
<220>
<223> variant c SDR1

<400> 86

Glu Ile Ala Asn Arg Glu
1 5

<210> 87
<211> 6
<212> PRT
<213> artificial sequence
<220>
<223> variant d SDR1

<400> 87

Lys Ala Val Val Gly Thr
1 5

<210> 88
<211> 6
<212> PRT
<213> artificial sequence
<220>
<223> variant e SDR1

<400> 88

Val Asn Ile Met Ala Ala
1 5

<210> 89
<211> 6
<212> PRT
<213> artificial sequence
<220>
<223> variant f SDR1

<400> 89

Ala Ala Phe Asn Gly Asp
1 5

<210> 90
<211> 5
<212> PRT
<213> artificial sequence
<220>
<223> variant a SDR2

<400> 90

Val His Pro Thr Ser
1 5

<210> 91
<211> 5
<212> PRT
<213> artificial sequence
<220>
<223> variant b SDR2

<400> 91

Arg Ser Pro Leu Thr
1 5

<210> 92
<211> 5
<212> PRT
<213> artificial sequence
<220>
<223> variant c SDR2

<400> 92

Arg Gly Ala Arg Thr
1 5

<210> 93
<211> 5
<212> PRT
<213> artificial sequence
<220>
<223> variant d SDR2

<400> 93

Arg Thr Pro Ile Ser
1 5

<210> 94
<211> 5
<212> PRT
<213> artificial sequence
<220>
<223> variant e SDR2
<400> 94

Thr Thr Ala Arg Lys
1 5

<210> 95
<211> 5
<212> PRT
<213> artificial sequence
<220>
<223> variant f SDR2
<400> 95

Arg Lys Asp Phe Trp
1 5

<210> 96
<211> 157
<212> PRT
<213> Homo sapiens

<400> 96

Val Arg Ser Ser Ser Arg Thr Pro Ser Asp Lys Pro Val Ala His Val
1 5 10 15

Val Ala Asn Pro Gln Ala Glu Gly Gln Leu Gln Trp Leu Asn Arg Arg
20 25 30

Ala Asn Ala Leu Leu Ala Asn Gly Val Glu Leu Arg Asp Asn Gln Leu
35 40 45

Val Val Pro Ser Glu Gly Leu Tyr Leu Ile Tyr Ser Gln Val Leu Phe
50 55 60

Lys Gly Gln Gly Cys Pro Ser Thr His Val Leu Leu Thr His Thr Ile
65 70 75 80

Ser Arg Ile Ala Val Ser Tyr Gln Thr Lys Val Asn Leu Leu Ser Ala
85 90 95

Ile Lys Ser Pro Cys Gln Arg Glu Thr Pro Glu Gly Ala Glu Ala Lys
100 105 110

Pro Trp Tyr Glu Pro Ile Tyr Leu Gly Gly Val Phe Gln Leu Glu Lys
115 120 125

Gly Asp Arg Leu Ser Ala Glu Ile Asn Arg Pro Asp Tyr Leu Leu Phe
130 135 140

Ala Glu Ser Gly Gln Val Tyr Phe Gly Ile Ile Ala Leu
145 150 155



Claims 

1. A method for generating a proteolytic enzyme having defined specificity not conferred by the protein scaffold towards at least one target substrate comprising at least the following steps:

(a) providing a protein scaffold having at least 70% homology to human trypsin I having the amino acid sequence shown in SEQ ID NO: 1, which catalyzes at least one chemical reaction on at least one substrate,

(b) generating a library of proteolytic enzymes or isolated proteolytic enzymes by combining a polynucleotide encoding the protein scaffold from step (a) via insertion or substitution with 1 to 11 specificity determining regions (SDRs), wherein the SDRs are fully or partially random synthetic oligonucleotide sequences encoding peptide sequences with a length of less than 50 amino acid residues at one or more positions from the group of positions within the polynucleotide encoding the protein scaffold that correspond structurally or by amino acid sequence homology to the regions 18-25, 38-48, 54-63, 73-86, 122-130, 148-156, 165-171 and 194-204 in human trypsin I having the amino acid sequence shown in SEQ ID NO: 1, expressing said enzymes, and

(c) selecting out of the library of proteolytic enzymes generated in step (b) one or more enzymes that have defined specificities not conferred by the protein scaffold provided in step (a) towards at least one target substrate.

2. The method according to claim 1, wherein the peptide sequences inserted or substituted in step (b) are fully or partially random and/or have a length variation; and/or wherein the selection in step (c) is achieved by screening for enzyme activity and/or enzyme affinity

(i) under low target substrate concentrations, or

(ii) by using the target substrate and at least one more substrate in comparison, or

(iii) by adding in excess other substrates than the target substrate, thereby using the added substrates as competitors, or

(iv) by adding enzyme inhibitors, or

(v) by selecting enzymes that preferentially bind to the target substrate and selecting out of this subgroup those enzymes that convert the substrate, or

(vi) any combination thereof.

3. The method according to claim 1, which comprises at least the following steps:

(a) providing a first protein scaffold fragment,

(b) connecting said protein scaffold fragment via a peptide linkage with a first specificity determining region, and optionally

(c) connecting the product of step (b) via a peptide linkage with a further specificity determining region peptide or with a further protein scaffold fragment, and optionally

(d) repeating step (c) for as many cycles as necessary in order to generate a sufficiently specific enzyme, and

(e) selecting out of the population generated in steps (a) - (d) one or more enzymes that have the desired specificities toward the one or more target substrates which is not conferred by the protein scaffold fragment
provided in step (a). 
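
As an informal illustration of step (b) of claim 1, the following sketch substitutes a short random peptide for one of the recited regions. It is a simplification made under stated assumptions: it works on the amino acid level rather than on the polynucleotide, it uses a hypothetical placeholder string instead of the real SEQ ID NO: 1, and all function names are invented for this example.

```python
import random

# Regions recited in claim 1(b), as 1-based inclusive residue ranges.
SDR_REGIONS = [(18, 25), (38, 48), (54, 63), (73, 86),
               (122, 130), (148, 156), (165, 171), (194, 204)]
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
MAX_SDR_LENGTH = 49  # claim 1 requires fewer than 50 residues

# Hypothetical stand-in scaffold; NOT the sequence of SEQ ID NO: 1.
PLACEHOLDER_SCAFFOLD = "G" * 224


def substitute_region(scaffold: str, region: tuple[int, int], peptide: str) -> str:
    """Replace residues start..end (1-based, inclusive) with `peptide`."""
    start, end = region
    if not 0 < len(peptide) <= MAX_SDR_LENGTH:
        raise ValueError("SDR peptide must have 1 to 49 residues")
    return scaffold[:start - 1] + peptide + scaffold[end:]


def random_library(scaffold: str, region: tuple[int, int],
                   peptide_length: int, size: int, seed: int = 0) -> list[str]:
    """Build `size` variants, each carrying a fully random peptide in `region`."""
    rng = random.Random(seed)
    return [
        substitute_region(
            scaffold, region,
            "".join(rng.choice(AMINO_ACIDS) for _ in range(peptide_length)),
        )
        for _ in range(size)
    ]


if __name__ == "__main__":
    library = random_library(PLACEHOLDER_SCAFFOLD, SDR_REGIONS[1], 6, 5)
    print(len(library), len(library[0]))
```

The selection of step (c) is an experimental screen and is not modeled here.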



Patent claims (from the German text) 

1. A method for producing a proteolytic enzyme having a defined specificity, which is not conferred by the protein scaffold, toward at least one target substrate, comprising at least the following steps: 

(a) providing a protein scaffold that has at least 70% homology to human trypsin I having the amino acid sequence shown in SEQ ID NO: 1 and that catalyzes at least one chemical reaction on at least one substrate, 

(b) generating a library of proteolytic enzymes or of isolated proteolytic enzymes by combining a polynucleotide encoding the protein scaffold from step (a), by insertion or substitution, with 1 to 11 specificity determining regions (SDRs), wherein the SDRs are fully or partially random synthetic oligonucleotide sequences that encode peptide sequences having a length of fewer than 50 amino acid residues, at one or more positions from the group of positions within the polynucleotide encoding the protein scaffold that correspond, structurally or by amino acid sequence homology, to the regions 18-25, 38-48, 54-63, 73-86, 122-130, 148-156, 165-171 and 194-204 in human trypsin I having the amino acid sequence shown in SEQ ID NO: 1, and expressing these enzymes, and 

(c) selecting, from the library of proteolytic enzymes generated in step (b), one or more enzymes having defined specificities, which are not conferred by the protein scaffold provided in step (a), toward at least one target substrate. 

2. The method according to claim 1, wherein the peptide sequences inserted or substituted in step (b) are fully or partially random and/or vary in length, and/or wherein the selection in step (c) is achieved by screening for enzyme activity and/or enzyme affinity 

(i) at low concentrations of the target substrate, or 

(ii) by using the target substrate and at least one further substrate for comparison, or 

(iii) by adding substrates other than the target substrate in excess, the added substrates being used as competitors, or 

(iv) by adding enzyme inhibitors, or 

(v) by selecting enzymes that bind preferentially to the target substrate and, from this subgroup, selecting those enzymes that convert the substrate, or 

(vi) by any combination thereof. 






3. The method according to claim 1, comprising at least the following steps: 

(a) providing a first protein scaffold fragment, 

(b) connecting the protein scaffold fragment via a peptide linkage with a first specificity determining region, and optionally 

(c) connecting the product of step (b) via a peptide linkage with a further specificity determining region peptide or with a further protein scaffold fragment, and optionally 

(d) repeating step (c) for as many cycles as necessary to generate a sufficiently specific enzyme, and 

(e) selecting, from the population generated in steps (a) - (d), one or more enzymes that have the desired specificities toward the one or more target substrates, which are not conferred by the protein scaffold fragment provided in step (a). 

Claims (from the French text) 

1. A process for producing a proteolytic enzyme having a defined specificity, which is not conferred by the protein scaffold, for at least one target substrate, comprising at least the following steps: 

(a) providing a protein scaffold having at least 70% homology with human trypsin I, which has the amino acid sequence shown in SEQ ID NO: 1, the scaffold catalyzing at least one chemical reaction on at least one substrate, 

(b) producing a library of proteolytic enzymes or of isolated proteolytic enzymes by combining a polynucleotide encoding the protein scaffold of step (a), by insertion of or replacement by 1 to 11 specificity determining regions (SDRs), the SDRs being fully or partially random synthetic nucleotide sequences encoding peptide sequences having a length of fewer than 50 amino acid residues, at one or more positions from the group of positions, within the polynucleotide encoding the protein scaffold, that correspond structurally, or by amino acid sequence homology, to the regions 18-25, 38-48, 54-63, 73-86, 122-130, 148-156, 165-171 and 194-204 of human trypsin I having the amino acid sequence shown in SEQ ID NO: 1, and expressing said enzymes, and 

(c) selecting, from the library of proteolytic enzymes produced in step (b), one or more enzymes that have defined specificities, which are not conferred by the protein scaffold provided in step (a), for at least one target substrate. 

2. The process according to claim 1, wherein the peptide sequences inserted or replaced in step (b) are fully or partially random and/or vary in length; and/or wherein the selection of step (c) is carried out by screening for enzymatic activity and/or affinity, 

(i) at low concentrations of the target substrate, or 

(ii) by using the target substrate and at least one additional substrate for comparison, or 

(iii) by adding, in excess, substrates other than the target substrate, so that the added substrates are used as competitors, or 

(iv) by adding enzyme inhibitors, or 

(v) by selecting enzymes that bind preferentially to the target substrate and selecting, within this subgroup, the enzymes that convert the substrate, or 

(vi) any combination of the above. 

3. The process according to claim 1, comprising at least one of the following steps: 

(a) providing a first protein scaffold fragment; 

(b) connecting said protein scaffold fragment, via a peptide bond, to a first specificity determining region, and optionally 

(c) connecting the product of step (b), via a peptide bond, with a further specificity determining region peptide or with a further protein scaffold fragment, and optionally 

(d) repeating step (c) for as many cycles as necessary to produce a sufficiently specific enzyme, and 

(e) selecting, from the population produced in steps (a) - (d), one or more enzymes that exhibit the desired specificities for the target substrate or substrates, which are not conferred by the protein scaffold fragment provided in step (a). 
Fig. 1 

Trypsin          IVGGYNCEBNSVPYQySL NSGYHF-C6GSLINEQWWSAGHCY 
α-Thrombin       IVEGSDAEZGMSPWQVMLFRKSPQELL'CGASLISDRWVLTAAHCLLYPP 
Enteropeptidase  IVGGSNAKEGAWPWWGL YYGGRLLCGASLVSSDWLVSAAHCVYGRN 

Trypsin          KS RI QVRLGEH MI EVLEGN -EQFINAAKI I RHPQYD- RKTL 
α-Thrombin       WDKNFTENDLLVRIGKH SRTRYERNIEKISMLEKIYIHPRYNWRENL 
Enteropeptidase  LE PSKWTAIIiGLHMKSHLTSPQTV-PRLID -- EIVINPHYN-RRRK 

Trypsin          NNDIMI.IKLSSRAVINARVSTISLPTA PPAT GTKCLISGWG 
α-Thrombin       DRDIALMKLKKPVAFSDYIHPVCLPDR ETAASLLQAGYKGRVTGWG 
Enteropeptidase  DN DI AHMH LEF KVIIYTD YI QP I CLPEENQVFPP GRNCS I AGWG 

Trypsin          H T ASSGAD YPDELQCLDAP VLSQAKCEAS YPG-KI TS HMFCVGFL 
α-Thrombin       KLKETWTAl-rVGKGQPSVLQWNLPIVERPVCKDSTRI-RITDHMFCAGYK 
Enteropeptidase  T WYQGTT -AN I LQEAD VP LLSNERCQQQMPE YM I TE NM I CAG YE 

Trypsin          -EGGK -- DSCQGDSGGPWCNGQ LQ GWSWGDGCAQKHKP 
α-Thrombin       PDEGKRGDACEGDSGGPFVMKSP FWNRWYQMGIVSMGEGCDRDGKY 
Enteropeptidase  -BGGI -- DSCQGDSGGPLMCQENNRWFLA GVTSFGYKCALPNRP 

Trypsin          GVYTKVYNYVKWIKHTIAAMS- 
α-Thrombin       GFYTHVFRLKKWIQKVIDQFGE 
Enteropeptidase  GVYARVSRFTEWIQSFLH 

Fig. 2 
Fig. 3 

sub     lAHEYAQSV -- PY GISQ -- IKAPALHSQGY 
furin   VAKRRAKRD -- VYQEPTDPKFPQQWYLSGVTQRDLNVKEAWAQGF 
PC_SK1  CKERSKRSALRDS ALNL -- FNDPMWNQQWYLQDTW4TAALPKLDL 
PC_SK5  NTHPCQ SO -- MHIEGAWKRGY 

sub     TGSNVKVAVIDSGIDSSHPDL -- NVRGGAS- -FVPSETN P 
furin   TGHGIWSILDDGIEK1JHPDI-AGNYDPGAS--FDVWDQD PDPQ 
PC_SK1  HVIP\WQKGITGKGWITVLDDGl.EWNHTDIYANYDPEASyDFNDNDHD P 
PC_SK5  TGKNIWTl LDDGIERTHPDL MQNYDA -- LASCDVNGHDI.DPMP 

sub     YQ DGSS HGTHVAGTIA--AL-NNSIGVLGVSPSASLYAVKV1.DS 
furin   PRYTQM NDHR HGTRCAGEVA- -AVAHHGVCGVGVAYHARIGGVRMLD 
PC_SK1  FPRYDPTNEHK HGTRCAGEI AMQA14-HHKCGV-GVAYMSKVGGIRMLDG 
PC_SK5  RY DASNENKHGTRCAGEVA- -AAAMNSHCTVGI AFNAKIGGVRMLDGDVTD 

sub     -TGSGQYSWI INGIE-WAISNI«IMDVIHMSLG GPT -- GSTA IJ<T- - 
furin   GEVTDAVEARS-LGLMPWHIHIYSASW GPEDDGKTVDGPARLAEE- - 
PC_SK1  -IVTDAIEASSIGFN PGHVDI YSASWGPHDDGKTVEGP GRLA QKAFE 
PC_SK5  MVEAKSVSFHPQHVH I Y5ASWGPDDDGKXVD GP A -- P LT RQ 

sub     -- WDKAVSSG 1 WAAAAGNEG5S GSTST VGYPAJ< YPST lAVGAV 
furin   -- AFFRGVSQGRGGLGSIFVWASGNGGREHOSCHCDGYTMSI-YTLSISSATQFGNV 
PC_SK1  YGVKQGROGKG S I FVWA5GNGGRQ GDNCDCD GYTDSIYTISI 
PC_SK5  -- AFEHGVRMGRRGLGSVFVWASGNGGRSKDHCSCDGYTNSI-YTISrSSTAESGKKPWY 

sub     -- N SSNQR ASFSSAG-SELDVMAPGVSIQSTLPGGTYGAY 
furin   -- PWYSEACSSTLA TTYSSGNQNEKQIVTTDLRQKCT ESH 
PC_SK1  -- S SASQQGLSPWYAEKCSSTLATSYSSG-DYTDQRITSADLHN0CT ETH 
PC_SK5  LEE CSSTL ATTYSSG-ESYDKKI ITTDLRQRCTDNH 

sub     MGTStHATPHVAGAAALIL -- SKHP -- TWTNAQVRDRLESTATY -- LG-MSFYYGKGLIHV" 
furin   TGTSASAPLAAGI IALTLEA^^KN^i -- TWRDMQHLWQTSKPAH -- LN-ADDWATNGVGRK 
PC_SK1  TGTSASAPLAAGIFALAL -- EANP -- HLTWRDMQHLVVWTSEYDPLA-NHPGWKKMGAGL 
PC_SK5  TGTS ASAPMA AG 1 1 ALAL- -EANPFLTWRDVQHVI VRTS RAGH -- LNAND WKTNAAGFK V 

Fig. 4 
Peps.  TLVDEQP LENYLDMEYFGTIGIGTPAQDFTWFOTGSSNLWVPSVYCSSL -- ACTN 
Secr.  EMVDN LRGKSGQeY YVBMTVGSPPQTLWH.VOTGSSNFAVGAAPHPFlrf 
Cath.  PAVTEGP IPE VLKMYMDAQY YGEI GIGTPPQCFTWFDTGSSNLWVP SI HCKLLDI ACW I 

Peps.  HMRFNPEDS STVQSTSETVS ITYGTGSMTGILGYDTVQV G GZ SDTH 
Secr.  HRYYQRQLSSTYRDLRKGVYVPYTQGKWEGELGTDLVSI PHGPNVTVRA 
Cath.  HHKYNSDKSSTYVKNGTSFDIHYGSGSLSGYLSQDTVSVPCQSASSASALG GVKVER 

Peps.  QIFGLSCTEPGSFLYYAPFDGILGLAYPSIS-- 5SGATPVF0NIWNQGLVSQDLFSVYLS 
Secr.  NIAAITESDK-FFZNGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHVP-NLFSLQLC 
Cath.  QVFGEATKQPGITFIAAKFDGILGMAVPRIS-- VWaVLPVFDNLMCXiKLVDQNlFSFYLS 

Peps.  ADD KS -- GSWIFGGIDSSYYTGSLHWVPVTVEGYWQITVDSITMl-lGETI 
Secr.  GAGFPLMQS EVLASV -- GGSM 1 1 GG I DHS LYTGS LW YTP I RREWY YEVI I VRVE INGQDL 
Cath.  RDP DAQPGGELMLGGTDSKYYKGSLSY1.NVTRKAYWQVHLDQVEVASGLT 

Peps.  A -- CAEGC -- QAIVDTGTSLLTGPTSPZAHIQSDIGASENSD GDMVVSCSAI 
Secr.  KMDCKE YMY DKSIVDSGTTN LR1.PKKVFEAAVKS IKAASSTEKFPDGFWLGEQI-V-CWQA 
Cath.  L -- CKEGC -- EAIVDTGTSLMVGPVDEVRELQKAIGAVPLIQ GEYMIPCEKV 

Peps.  SSLPDIVFTI MGVQYPVPPSAYILQSEGS CISGFQGMIIVP-TESG 
Secr.  GTTPWNIFPVISLYLMGEVT>IQSFRITILPQQYLRPVEDV ATSQDDCYKFAISQSS 
Cath.  STLPAITLKL GGKGYKLSPEDYTLKVSQAGKTLCLSGFMGMDIP-PPSG 

Peps.  ELWILGDVFIRQYFTVFDRAHNQVGLAPVA 
Secr.  TGTVMGAVIt^GFYVVFDRARKRIGFAVSA 
Cath.  PLWILGDVFIGRYYTVFDRDNNRVGFAEAA 

Fig. 6 
  1 ML£ADOQGCI EEQGVEDSAN EDSVDAKPDR SSFVPSLFSK KKKMVTMRSI KTTRDRVPTY 
 61 QYHMNFEKIXS KCIIimiKNF DKVTGMGVIU) GTDKDASAI^T KCFRSLGFDV IVYNDCSCAK 
121 MQDL.LKKASE EOHTHAACFA CILLSHGEEH VIYGKDGVTP IKDLTAHFRG ORSKTLI£KP 
181 KLFFIQACRG TELODGIQAO SGPINDTDAN PRVKIPVEAD FLFAYSTVPG YYSWRSPGRG 
241 SWFVQALCSI LEEHGKOl-EI MQILTRVHDR VARHFESQSD DPHFHEKKQI PCWSMLTKE 
301 LYFSQ 

Fig. 8 
Protein scaffold 
Candidate SDR insertion sites 
Insertion of random SDRs 
Selection for increased specificity 
NBE with intended specificity 
substrate A substrate B 
Fig. 13 
Fig. 14 
Fig. 15 
Fig. 16 



EP 1 633 865 B1 

Fig. 17a: proteolytic digestion of TNF (x-axis: time [min]; curves: variant x, variant xi, variant xii, Trypsin, negative) 

Fig. 17b: proteolytic digestion of serum protein (x-axis: time [min]; curves: variant x, variant xi, variant xii, Trypsin, negative) 

Fig. 18 (x-axis: time [min]) 

REFERENCES CITED IN THE DESCRIPTION 

This list of references cited by the applicant is for the reader's convenience only. It does not form part of the European 
patent document. Even though great care has been taken in compiling the references, errors or omissions cannot be 
excluded and the EPO disclaims all liability in this regard. 



Patent documents cited in the description 

• WO 02090300 A2, Briggs [0021] 
• WO 9627671 A, Ballinger [0024] 
• EP 0304864 W [0024] [0127] [0142] [0150] 
• US 5258289 A [0024] 
• WO 9621009 A [0024] 
• WO 9811237 A, Duff [0025] 
• WO 0142432 A [0026] 
• WO 03095670 A [0038] 
• WO 9218645 A [0127] [0141] 
• EP 02020576 A [0127] 
• WO 0212543 A [0142] [0149] [0152] 
• WO 0134835 A [0143] [0148] [0161] 
• WO 9522625 A [0143] 
• WO 9842728 A [0143] 
• WO 0124933 A [0150] 
• DE 19646372 [0154] 



Non-patent literature cited in the description 

• Janeway, C. et al. Immunobiology. Elsevier Science Ltd., Garland Publishing, 1999 [0004] 
• Handbook of proteolytic enzymes. Academic Press, 1998 [0009] 
• Perona, J. ; Craik, C. Protein Science, 1995, vol. 4, 337-360 [0009] [0137] 
• Schechter ; Berger. Biochem. Biophys. Res. Commun., 1967, vol. 27, 157-162 [0009] [0036] 
• Ding, L. et al. Proc. Natl. Acad. Sci. USA, 1995, vol. 92, 7627-7631 [0009] 
• Coombs, G. et al. J. Biol. Chem., 1996, vol. 271, 4461-4467 [0009] 
• Fersht, A. Enzyme Structure and Mechanism. W. H. Freeman and Company, 1995 [0011] 
• Rawlings, N.D. ; Barrett, A.J. Proteolysis in Cell Functions. IOS Press, 1997, 13-21 [0011] 
• Rawlings ; Barrett. Handbook of proteolytic enzymes. Academic Press, 1998 [0011] 
• Rawlings, N.D. ; Barrett, A.J. Methods Enzymol., 1994, vol. 244, 19-61 [0011] 
• Wyss, M. et al. Biochemical characterization of fungal phytases (myo-inositol hexakisphosphate phosphohydrolases): Catalytic properties. Applied & Environmental Microbiology, 1999, vol. 65, 367-373 [0016] 
• Bedford, M. R. ; Schulze, H. EXOGENOUS ENZYMES FOR PIGS AND POULTRY. Nutrition Research Reviews, 1998, vol. 11, 91-114 [0016] 
• Murphy, T. C. ; Bedford, M. R. ; McCracken, K. J. Effect of a range of new xylanases on in vitro viscosity and on performance of broiler diets. British Poultry Science, 2003, vol. 44, S16-S18 [0016] 
• Drugs of today, 1997, vol. 33, 641-648 [0018] 
• Heuer L. ; Blumenberg D. Anaesthesist, 2002, vol. 51, 388 [0018] 
• Verstraete, M. et al. Drugs, 1995, vol. 50, 29-41 [0018] 
• Hamilton et al. Expert Opin Pharmacother, 2000, vol. 1 (5), 1041-1052 [0019] 
• Fuiani F. et al. Protein Engineering, 2003, vol. 16, 515-519 [0020] 
• Kurth, T. et al. Biochemistry, 1998, vol. 37, 11434-11440 [0022] [0023] 
• Ballinger, M. et al. Biochemistry, 1996, vol. 35, 13579-13585 [0022] 
• Horrevoets et al. J. Biol. Chem., 1993, vol. 268, 779-782 [0022] 
• Sices, H. ; Kristie, T. Proc. Natl. Acad. Sci. USA, 1998, vol. 95, 2828-2833 [0022] 
• Corey, M.J. ; Corey, E. Proc. Natl. Acad. Sci. USA, 1996, vol. 93, 11428-11434 [0023] 
• Hedstrom, L. et al. Science, 1992, vol. 255, 1249-1253 [0023] 
• Fersht et al. Methods of producing novel enzymes [0026] 
• Altamirano et al. Nature, 2000, vol. 403, 617-622 [0026] 
• Fersht, A. Enzyme Structure and Mechanism, 1995 [0037] 
• M. Fischer ; J. Pleiss. Nucl. Acids Res., 2003, vol. 31, 319-321 [0128] 
• Guex, N. ; Peitsch, M.C. Electrophoresis, 1997, vol. 18, 2714-2723 [0132] 
• Altschul, S.F. ; Gish, W. ; Miller, W. ; Myers, E.W. ; Lipman, D.J. J. Mol. Biol., 1990, vol. 215, 403-410 [0134] 
• Ewens, W. ; Grant, G.R. Statistical methods in Bioinformatics: an introduction. Springer, 2001 [0134] 
• Turner, R. et al. J. Biol. Chem., 2002, vol. 277, 33068-33074 [0134] 
• H.M. Berman et al. Nucleic Acids Research, 2000, vol. 28, 235-242 [0135] 
• Murzin A. G. et al. J. Mol. Biol., 1995, vol. 247, 536-540 [0135] 






• Orengo et al. Structure, 1997, vol. 5 (8), 1093-1108 [0135] 
• Holm ; Sander. Nucl. Acids Res., 1998, vol. 26, 316-319 [0135] 
• Gibrat ; Madej ; Bryant. Current Opinion in Structural Biology, 1996, vol. 6, 377-385 [0135] 
• Perona, J. ; Craik, C. J. Biol. Chem., 1997, vol. 272, 29987-29990 [0137] 
• Ottesen, M. ; Svendsen, A. Methods Enzymol., 1998, vol. 19, 199-215 [0138] 
• Seidah, N. ; Chretien, M. Curr. Opin. Biotech., 1997, vol. 8, 602-607 [0138] 
• Bergeron, F. et al. J. Mol. Endocrin., 2000, vol. 24, 1-22 [0138] 
• Rawlings, N.D. ; Barrett, A.J. Methods Enzymol., 1995, vol. 248, 105-120 [0139] 
• Chitpinityol, S. ; Crabbe, MJ. Food Chemistry, 1998, vol. 61, 395-418 [0139] 
• Gruninger-Leitch, F. et al. J. Biol. Chem., 2002, vol. 277, 4687-4693 [0139] 
• Wang, W. ; Liang, TC. Biochemistry, 1994, vol. 33, 14636-14641 [0139] 
• Wu, J. et al. Biochemistry, 1998, vol. 37, 4518-4526 [0139] 
• Pettit, S. et al. J. Biol. Chem., 1991, vol. 266, 14539-14547 [0139] 
• Kageyama, T. Cell. Mol. Life Sci., 2002, vol. 59, 288-306 [0139] 
• Aguilar, C. F. et al. Adv. Exp. Med. Biol., 1995, vol. 362, 155-166 [0139] 
• Rawlings, N.D. ; Barrett, A.J. Methods Enzymol., 1994, vol. 244, 461-486 [0140] 
• Arcoleo, J. ; Greer, J. J. Biol. Chem., 1982, vol. 257, 10063-10068 [0141] 
• Almeida, R. et al. Biochem. Biophys. Res. Commun., 1991, vol. 177, 688-695 [0141] 
• Cadwell, R.C. ; Joyce, G.F. PCR Methods Appl., 1992, vol. 2, 28-33 [0141] 
• Fersht, A.R. Biochemistry, 1989, 8031-8036 [0141] 
• Gregoret, L.M. ; Sauer, R.T. PNAS, 1993, 4246-4250 [0141] 
• Weiss et al. PNAS, 2000, 8950-8954 [0141] 
• Lesley et al. Methods in Molecular Biology, 1995, vol. 37, 265-278 [0142] [0149] 
• Murakami et al. Nature Biotechnology, 2002, vol. 20, 76-81 [0143] 
• Ostermeier, M. et al. Nature Biotechnology, 1999, vol. 17, 1205-1209 [0143] 
• Sambrook et al. Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory, 1989 [0158] 
• Ausubel et al. Current Protocols in Molecular Biology. Wiley Interscience, 1987 [0158] 
• Cadwell, R.C. ; Joyce, G.F. PCR Methods Appl., 1992, vol. 2, 28-33 [0161] 






